Erick Calder

NAME

String::Canonical - Creates canonical strings.

SYNOPSIS

 use String::Canonical qw/cstr/;
 print cstr("one thousand maniacs");

 print String::Canonical::get("Second tier");

DESCRIPTION

This module generates a canonical string by converting Roman numerals and English number words to digits, stripping accents from characters (and folding digraphs, e.g. oe = ö, ae = æ), replacing words with symbols (e.g. and = &, plus = +), and removing common variant endings.

In short, this module generates the same signature for the following strings:

    bjørk = björk = bjoerk = bjork
    1,000 maniacs = one thousand maniacs = 1k maniacs
    Boyz II Men = Boyz To Men = Boyz 2 Men
    ACDC = AC/DC = AC-DC
    Rubin and company = Rubin & Company = Rubin & Co.
    Third Eye Blind = 3rd eye blind
    Train runnin' = Train Running
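
One practical use of these equivalences is grouping variant spellings under a single key. The following is a minimal sketch, assuming the module is installed; group_by_signature is a hypothetical helper written for this illustration and is not part of String::Canonical. No output is shown because the exact form of the canonical string is not specified here.

```perl
use strict;
use warnings;
use String::Canonical qw/cstr/;

# Hypothetical helper (not part of the module): bucket raw names
# by the canonical signature that cstr returns for each of them.
sub group_by_signature {
    my %groups;
    for my $name (@_) {
        push @{ $groups{ cstr($name) } }, $name;
    }
    return \%groups;
}

my $groups = group_by_signature(
    "AC/DC", "ACDC", "AC-DC", "Train runnin'", "Train Running",
);

# Print each signature followed by the spellings that map to it.
for my $sig (sort keys %$groups) {
    print "$sig: ", join(" | ", @{ $groups->{$sig} }), "\n";
}
```

If the equivalences listed above hold, the five names should fall into two groups.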

INTERFACE

The following functions may be imported into the caller package by name:

cstr/get [string = $_]

Returns the canonical form of the string passed. If no string is passed, $_ is used. When called in void context, the function sets $_ to the canonical form. The function may also be accessed as get, but only cstr may be exported.
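
The calling conventions described above can be sketched as follows. This is an illustration under the assumption that the module behaves as documented; the input strings are arbitrary and no output is shown.

```perl
use strict;
use warnings;
use String::Canonical qw/cstr/;

# Explicit argument: the canonical form is returned.
my $canon = cstr("Rubin and company");
print "$canon\n";

# No argument: cstr reads the string from $_.
for ("AC/DC", "AC-DC", "ACDC") {
    print cstr(), "\n";
}

# Void context: per the description above, $_ itself is replaced.
$_ = "Third Eye Blind";
cstr();
print "$_\n";

# get is not exportable, so it is called by its fully qualified name.
print String::Canonical::get("Second tier"), "\n";
```

Note that the void-context form modifies $_ in place, so it should not be used while $_ is aliased to a read-only value such as a string literal in a for loop.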

cstr_cmp/cmp <string> [string = $_]

Compares the canonical forms of two strings. If the second string is not provided, $_ is used.
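
A short sketch of both call forms follows, assuming cstr_cmp can be imported by name as stated above. The return convention is not documented in this text (it could be a true/false flag or a cmp-style ordering value), so the example only prints the result rather than branching on it. Comparing two cstr results with eq is shown as an unambiguous alternative.

```perl
use strict;
use warnings;
use String::Canonical qw/cstr cstr_cmp/;

# Two explicit arguments.
my $result = cstr_cmp("Boyz II Men", "Boyz 2 Men");
print "two-argument form: ", (defined $result ? $result : 'undef'), "\n";

# One argument: the second string is taken from $_.
$_ = "one thousand maniacs";
$result = cstr_cmp("1,000 maniacs");
print "one-argument form: ", (defined $result ? $result : 'undef'), "\n";

# Unambiguous alternative: compare the canonical forms directly.
print "same signature\n" if cstr("AC/DC") eq cstr("ACDC");
```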

AUTHOR

Erick Calder <ecalder@cpan.org>

SUPPORT

For help and thank-you notes, e-mail the author directly. To report a bug, submit a patch, or add to the wishlist, please visit the CPAN bug manager at: http://rt.cpan.org

AVAILABILITY

The latest version of the tarball, RPM and SRPM may always be found at: http://perl.arix.com/. Additionally, the module is available from CPAN.

LICENCE AND COPYRIGHT

This utility is free and distributed under the GPL, the GNU General Public License. A copy of this license was included in a file called LICENSE. If for some reason this file was not included, please see http://www.gnu.org/licenses/ to obtain a copy of the license.

$Id: Canonical.pm,v 1.2 2003/02/15 01:44:39 ekkis Exp $
