Unicode::Digits - Convert UNICODE digits to integers you can do math with
Version 20090607
So, you have matched a string with \d and now want to do some math. What is that you say? The number you captured plus 5 is 5? Oh, that is right: \d now matches UNICODE digits, not just [0-9]. What to do? Well, you can just call digits_to_int and all of your troubles* are over!
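To make the problem concrete, here is a small sketch that uses only core Perl (no Unicode::Digits). It shows a Mongolian "42" matching \d while still counting as 0 in arithmetic. The variable names are illustrative only.

```perl
use strict;
use warnings;

my $mongolian = "\x{1814}\x{1812}";    # MONGOLIAN DIGIT FOUR, MONGOLIAN DIGIT TWO

# \d matches because both characters are Unicode decimal digits.
my $matched = $mongolian =~ /^\d+$/ ? 1 : 0;

# Perl only numifies ASCII digits, so the string counts as 0 in arithmetic.
my $sum = do {
    no warnings 'numeric';             # silence the "isn't numeric" warning
    $mongolian + 5;
};

print "matched: $matched, sum: $sum\n";    # matched: 1, sum: 5
```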
    use Unicode::Digits qw/digits_to_int/;

    my $string = "forty-two in Mongolian is \x{1814}\x{1812}";
    my $num    = digits_to_int $string =~ /(\d+)/;
    print $num + 5, "\n";
The digits_to_int function transliterates a string of UNICODE digit characters to a number you can do math with. Non-digit characters are passed through, so "42 is \x{1814}\x{1812}" becomes "42 is 42".
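The transliteration described above can be approximated with the core Unicode::UCD module. This is only a sketch of the idea, not the module's actual implementation: ascii_digits is a hypothetical helper, and it assumes that charinfo reports an empty "decimal" field for characters that are not decimal digits.

```perl
use strict;
use warnings;
use Unicode::UCD qw(charinfo);

# Hypothetical helper (not part of Unicode::Digits): replace every character
# that has a decimal digit value with the matching ASCII digit and pass all
# other characters through unchanged.
sub ascii_digits {
    my ($string) = @_;
    my $out = '';
    for my $char (split //, $string) {
        my $info    = charinfo(ord $char);
        my $decimal = $info ? $info->{decimal} : undef;
        $out .= (defined $decimal && length $decimal) ? $decimal : $char;
    }
    return $out;
}

my $plain = ascii_digits("42 is \x{1814}\x{1812}");
print "$plain\n";    # 42 is 42
```

Looking up each character individually is slow for long strings, but it keeps the sketch short and avoids any hard-coded table of digit ranges.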
You can optionally pass an argument that controls what happens when the source string contains non-digit characters or characters from different sets of digits. ERRORHANDLING can be one of "strict", "loose", "looser", or "loosest". Their behaviours are as follows:

"strict"
    All of the characters must be digit characters and they must all come from the same range (so no mixing Mongolian digits with Arabic-Indic digits) or the function will die.

"loose"
    All of the characters must be digit characters or it will die. If there are characters from different ranges you will get a warning.

"looser"
    If there are any non-digit characters, or the characters are from different ranges, you will get a warning.

"loosest"
    This is the default mode. All non-digit characters are passed through without warning, and the digits do not have to come from the same range.
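As a rough illustration of the modes, the sketch below feeds a string that mixes two digit ranges to digits_to_int, first in the default mode and then in "strict" mode. It assumes Unicode::Digits is installed and that the mode is passed as the second argument, as the text above describes. The results follow from that description and are not shown as verified output.

```perl
use strict;
use warnings;
use Unicode::Digits qw/digits_to_int/;

# Mongolian four followed by Arabic-Indic two: digits from two different ranges.
my $mixed = "\x{1814}\x{0662}";

# Default mode ("loosest"): mixed ranges are accepted without a warning.
my $loosest = digits_to_int($mixed);

# "strict" is documented to die on mixed ranges, so trap the exception.
my $strict = eval { digits_to_int($mixed, "strict") };
my $error  = $@;

print "loosest: $loosest\n";
print "strict died: $error" if !defined $strict;
```

Wrapping the "strict" call in eval lets a caller try the safest mode first and decide for itself how to handle input that fails the check.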
Chas. J. Owens IV, <chas.owens at gmail.com>
digits_to_int takes one or two arguments. If you pass no arguments or more than two, you will receive this error.
If you pass a second argument that is not strict, loose, looser, or loosest to digits_to_int, you will receive this error.
You will receive this message as a warning or error (depending on what mode you chose) if the string has characters that do not have the UNICODE digit property.
You will receive this message as a warning or error (depending on what mode you chose) if the string has characters that are not part of the same range of digit characters.
This error is unlikely to occur. If it does, the bug is either in my code (the likely scenario) or in Unicode::UCD (not very likely).
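For comparison, the core Unicode::UCD module has a num function that applies similar checks on its own: it returns undef when a string contains non-digits or digits from more than one range. The sketch below is not how Unicode::Digits detects these conditions; it only shows the two situations the messages above describe. It assumes Perl 5.14 or later, where num is available.

```perl
use strict;
use warnings;
use Unicode::UCD qw(num);

my $same_range = num("\x{1814}\x{1812}");    # Mongolian four, Mongolian two
my $mixed      = num("\x{1814}\x{0662}");    # Mongolian four, Arabic-Indic two
my $non_digit  = num("4x");                  # ASCII four plus a letter

print "same range: $same_range\n";           # same range: 42
print "mixed: ",     defined $mixed     ? $mixed     : "undef", "\n";
print "non-digit: ", defined $non_digit ? $non_digit : "undef", "\n";
```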
My understanding of UNICODE is flawed; therefore, I have undoubtedly done something wrong. For instance, what should be done with "5\x{0308}"? Also, there is a bunch of stuff relating to surrogates I don't understand.
You can find documentation for this module with the perldoc command.
perldoc Unicode::Digits
Copyright 2009 Chas. J. Owens IV, all rights reserved.
This program is free software; you can redistribute it and/or modify it under the same terms as Perl itself.