Roman::Unicode - Make roman numerals, using the Unicode characters for them
    use Roman::Unicode qw( to_roman is_roman to_perl );

    my $perl_number  = to_perl( $roman ) if is_roman( $roman );
    my $roman_number = to_roman( $arabic );
I made this module as a way to demonstrate various Unicode things without mixing up natural language stuff. Surprisingly, roman numerals can do quite a bit with that. You'll have to read the source to see it in action.
There are many fancy characters in this documentation, so you need a good font that has the right glyphs. The Symbola font is a good one: http://users.teilar.gr/~g1951d/
- is_roman( STRING )
Returns true if the string looks like a valid roman numeral. This works with either the ASCII version or the ones using the characters in the U+2160 to U+2188 range. You cannot mix the uppercase and lowercase numerals.
- to_perl( ROMAN )
If the argument is a valid roman numeral, to_perl returns the Perl number. Otherwise, it returns nothing.
- to_roman( PERL_NUMBER )
If the argument is a valid Perl number, even if it is a string, to_roman returns the roman numeral representation. This uses the characters in the U+2160 to U+2188 range.
If the number cannot be represented as roman numerals, this returns nothing. Note that 0 doesn't have a roman numeral representation.
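To make the conversion concrete, here is a minimal sketch of a greedy number-to-numeral routine. It is not the module's to_roman: sketch_to_roman and @pairs are hypothetical names, and the subtractive forms above 1,000 (such as Ⅿↁ for 4,000) are an assumption of this sketch that the real implementation may not share.

```perl
use strict;
use warnings;
use utf8;    # the numeral literals below are UTF-8 in the source

# Hypothetical table: value/numeral pairs in descending order.
# Assumption: subtractive forms are used at every magnitude.
my @pairs = (
    [ 100_000, 'ↈ' ], [ 90_000, 'ↂↈ' ], [ 50_000, 'ↇ' ], [ 40_000, 'ↂↇ' ],
    [  10_000, 'ↂ' ], [  9_000, 'Ⅿↂ' ], [  5_000, 'ↁ' ], [  4_000, 'Ⅿↁ' ],
    [   1_000, 'Ⅿ' ], [    900, 'ⅭⅯ' ], [    500, 'Ⅾ' ], [    400, 'ⅭⅮ' ],
    [     100, 'Ⅽ' ], [     90, 'ⅩⅭ' ], [     50, 'Ⅼ' ], [     40, 'ⅩⅬ' ],
    [      10, 'Ⅹ' ], [      9, 'ⅠⅩ' ], [      5, 'Ⅴ' ], [      4, 'ⅠⅤ' ],
    [       1, 'Ⅰ' ],
);

# Hypothetical helper: returns nothing for input it cannot represent.
sub sketch_to_roman {
    my( $n ) = @_;
    return unless defined $n and $n =~ /\A[0-9]+\z/;
    return unless $n > 0 and $n < 400_000;

    my $roman = '';
    for my $pair ( @pairs ) {
        my( $value, $numeral ) = @$pair;
        # Take the largest value that still fits, as often as it fits.
        while( $n >= $value ) {
            $roman .= $numeral;
            $n     -= $value;
        }
    }
    return $roman;
}

binmode STDOUT, ':encoding(UTF-8)';
print sketch_to_roman( 1994 ), "\n";
```

Like the documented function, the sketch returns nothing for 0 and for anything at or above 400,000.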
If you want the lowercase version, you can use lc on the result. However, some of the roman numerals don't have lowercase versions.
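The lowercase caveat can be seen with core Perl alone. This small sketch does not use the module; it only shows that lc maps Ⅿ (U+216F) to its lowercase form but leaves ↁ (U+2181) unchanged because it has no lowercase mapping.

```perl
use strict;
use warnings;

# U+216F ROMAN NUMERAL ONE THOUSAND has a lowercase form, U+217F.
my $thousand       = "\x{216F}";
my $lower_thousand = lc $thousand;

# U+2181 ROMAN NUMERAL FIVE THOUSAND has no lowercase mapping,
# so lc returns it unchanged.
my $five_thousand       = "\x{2181}";
my $lower_five_thousand = lc $five_thousand;

printf "U+%04X -> U+%04X\n", ord $thousand,      ord $lower_thousand;
printf "U+%04X -> U+%04X\n", ord $five_thousand, ord $lower_five_thousand;
```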
- to_ascii( ROMAN )
If the argument is a valid roman numeral, it returns an ASCII representation of it. Most of the numeral code points have compatibility decompositions, so the first step uses NFKD decomposition. For other characters, it uses ASCII art representations:
    Roman    ASCII art
    ------   ----------
    ↁ        |)
    ↂ        ((|))
    ↈ        (((|)))
    ↇ        |))
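The NFKD step can be tried on its own with the core Unicode::Normalize module. This is only a sketch of that first step, not the module's to_ascii; the variable names are made up for the example.

```perl
use strict;
use warnings;
use Unicode::Normalize qw( NFKD );

# ⅯⅭⅩ (U+216F U+216D U+2169): each code point has a compatibility
# decomposition to a plain ASCII letter.
my $ascii = NFKD( "\x{216F}\x{216D}\x{2169}" );

# ↁ (U+2181) has no decomposition, so NFKD leaves it alone. This is
# the kind of character that needs an ASCII art fallback.
my $five_thousand = NFKD( "\x{2181}" );

print "$ascii\n";
printf "U+%04X\n", ord $five_thousand;
```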
As a demonstration of case mapping, I supply one function that uses Unicode::Casing. You can lexically override the case-mapping functions as described in that module's documentation.
A subroutine you can use with Unicode::Casing. It's a bit more special because it turns the higher magnitude characters into ASCII versions. That means that the return value might not be valid according to is_roman. It returns nothing if the input isn't a valid Roman numeral string.
You can also use this as a stand-alone function instead of lc. That's the smart way to do it, but then you don't get to play with Unicode::Casing.
Perl lets you define your own properties, as documented in perlunicode. This module defines several.
The IsRoman property is a combination of IsUppercaseRoman and IsLowercaseRoman.
The IsUppercaseRoman property matches these code points:
    Ⅰ   U+2160   ʀᴏᴍᴀɴ ɴᴜᴍᴇʀᴀʟ ᴏɴᴇ
    Ⅴ   U+2164   ʀᴏᴍᴀɴ ɴᴜᴍᴇʀᴀʟ ꜰɪᴠᴇ
    Ⅹ   U+2169   ʀᴏᴍᴀɴ ɴᴜᴍᴇʀᴀʟ ᴛᴇɴ
    Ⅼ   U+216C   ʀᴏᴍᴀɴ ɴᴜᴍᴇʀᴀʟ ꜰɪꜰᴛʏ
    Ⅽ   U+216D   ʀᴏᴍᴀɴ ɴᴜᴍᴇʀᴀʟ ᴏɴᴇ ʜᴜɴᴅʀᴇᴅ
    Ⅾ   U+216E   ʀᴏᴍᴀɴ ɴᴜᴍᴇʀᴀʟ ꜰɪᴠᴇ ʜᴜɴᴅʀᴇᴅ
    Ⅿ   U+216F   ʀᴏᴍᴀɴ ɴᴜᴍᴇʀᴀʟ ᴏɴᴇ ᴛʜᴏᴜsᴀɴᴅ
    ↁ   U+2181   ʀᴏᴍᴀɴ ɴᴜᴍᴇʀᴀʟ ꜰɪᴠᴇ ᴛʜᴏᴜsᴀɴᴅ
    ↂ   U+2182   ʀᴏᴍᴀɴ ɴᴜᴍᴇʀᴀʟ ᴛᴇɴ ᴛʜᴏᴜsᴀɴᴅ
    ↇ   U+2187   ʀᴏᴍᴀɴ ɴᴜᴍᴇʀᴀʟ ꜰɪꜰᴛʏ ᴛʜᴏᴜsᴀɴᴅ
    ↈ   U+2188   ʀᴏᴍᴀɴ ɴᴜᴍᴇʀᴀʟ ᴏɴᴇ ʜᴜɴᴅʀᴇᴅ ᴛʜᴏᴜsᴀɴᴅ
This excludes the other Roman numeral code points, such as Ⅻ (U+216B, ʀᴏᴍᴀɴ ɴᴜᴍᴇʀᴀʟ ᴛᴡᴇʟᴠᴇ) since they are not designed to be part of larger strings of Roman numerals.
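For readers who have not defined a property before, this is a minimal sketch of the perlunicode mechanism applied to the code points listed above. IsSketchUpperRoman is a hypothetical name chosen for the example so it does not collide with the module's own IsUppercaseRoman, and the module's definition may be written differently.

```perl
use strict;
use warnings;

# Hypothetical property for this sketch. Perl requires the name to
# start with "Is" or "In", and the sub returns one hexadecimal code
# point (or a range) per line.
sub IsSketchUpperRoman {
    return join "\n", qw(
        2160 2164 2169 216C 216D 216E 216F
        2181 2182 2187 2188
    ), '';
}

# Ⅹ (U+2169) is in the set; Ⅻ (U+216B) is deliberately left out.
my $ten_matches    = "\x{2169}" =~ /\A\p{IsSketchUpperRoman}\z/ ? 1 : 0;
my $twelve_matches = "\x{216B}" =~ /\A\p{IsSketchUpperRoman}\z/ ? 1 : 0;

printf "U+2169 matches: %d\n", $ten_matches;
printf "U+216B matches: %d\n", $twelve_matches;
```

If the sub lives in a package other than the one containing the regex, the property has to be written with its package name inside \p{}, as perlunicode describes.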
IsLowercaseRoman is the set of lowercase code points derived from the code points in IsUppercaseRoman. It looks up each code point of IsUppercaseRoman in the Unicode Character Database (UCD), through Unicode::UCD, to see if it has a lowercase mapping. If there is a lowercase mapping, that code point becomes part of this property.
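The derivation can be approximated with the core Unicode::UCD module. This sketch is not the module's code: @upper and @lower are names invented here, and it only shows the lookup of the lowercase mapping for each uppercase code point.

```perl
use strict;
use warnings;
use Unicode::UCD qw( charinfo );

# The uppercase code points listed for IsUppercaseRoman.
my @upper = ( 0x2160, 0x2164, 0x2169, 0x216C, 0x216D, 0x216E, 0x216F,
              0x2181, 0x2182, 0x2187, 0x2188 );

# charinfo() returns a hash reference. Its "lower" field is a hex
# string naming the lowercase mapping, or an empty string if the
# code point has none.
my @lower = map  { hex $_ }
            grep { defined $_ && length $_ }
            map  { charinfo( $_ )->{lower} } @upper;

printf "U+%04X\n", $_ for @lower;
```

Only the code points from Ⅰ through Ⅿ come back with a mapping; the four larger numerals have none, so they drop out of the lowercase set.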
Using only the roman numeral characters defined in Unicode, you're limited to numbers less than 400,000 (although you could write ↈↈↈↈ if you wanted, since that's not unheard of).
brian d foy
This module started with the Roman module, credited to:
<ozawa at aisoft.co.jp> 1995-1997
<alexchorny at gmail.com> 2007
Copyright (c) 2011, brian d foy.
You can use this module under the same terms as Perl itself.