
Name

Lingua::EN::SimilarNames::Levenshtein - Compare people's first and last names.

Synopsis

    use Data::Dumper;
    use Lingua::EN::SimilarNames::Levenshtein;

    my $people = [
        [ 'John',     'Wayne' ], 
        [ 'Sundance', 'Kid' ], 
        [ 'Jose',     'Wales' ], 
        [ 'John',     'Wall' ], 
    ];
    
    my @people_objects = map { 
        Person->new(
            first_name => $_->[0], 
            last_name  => $_->[1],
        )
    } @{$people};
    
    # Build list of name pairs within 5 character edits of each other
    my $similar_people = SimilarNames->new(
        list_of_people   => \@people_objects, 
        maximum_distance => 5
    );
    
    # Get the people name pairs as an ArrayRef[ArrayRef[ArrayRef[Str]]]
    print Dumper $similar_people->list_of_similar_name_pairs;
    # which results in:
    [
        [ [ "Jose", "Wales" ], [ "John", "Wall" ] ],
        [ [ "Jose", "Wales" ], [ "John", "Wayne" ] ],
        [ [ "John", "Wall" ],  [ "John", "Wayne" ] ]
    ]
   

Description

Given a list of Person objects, find the pairs of people whose names are within a specified edit distance of each other.

Classes

Person

This class defines people objects with first and last name attributes.

CompareTwoNames

This class defines comparator objects. Given two Person objects, it computes the edit distance between their names.
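
As a rough illustration of what such a comparison involves, here is a minimal pure-Perl sketch. It is not the module's implementation: the levenshtein and name_distance subroutines are hypothetical helpers, and summing the first-name and last-name distances is an assumption that the documentation above does not confirm (it is merely consistent with the Synopsis output).

```perl
use strict;
use warnings;
use List::Util qw(min);

# Dynamic-programming Levenshtein distance that keeps only two rows.
sub levenshtein {
    my ($source, $target) = @_;
    my @source_chars = split //, $source;
    my @target_chars = split //, $target;

    # Distance from the empty prefix of $source to each prefix of $target.
    my @previous = (0 .. scalar @target_chars);

    for my $i (1 .. @source_chars) {
        # Deleting $i characters turns a prefix of $source into the empty string.
        my @current = ($i);
        for my $j (1 .. @target_chars) {
            my $cost = $source_chars[$i - 1] eq $target_chars[$j - 1] ? 0 : 1;
            push @current, min(
                $previous[$j] + 1,            # deletion
                $current[$j - 1] + 1,         # insertion
                $previous[$j - 1] + $cost,    # substitution (or match)
            );
        }
        @previous = @current;
    }
    return $previous[-1];
}

# Hypothetical helper: distance between two [first, last] name pairs,
# ASSUMED here to be the sum of the per-part distances.
sub name_distance {
    my ($name_a, $name_b) = @_;
    return levenshtein($name_a->[0], $name_b->[0])
         + levenshtein($name_a->[1], $name_b->[1]);
}

# 'Jose' -> 'John' costs 2 and 'Wales' -> 'Wall' costs 2, so this prints 4.
print name_distance([ 'Jose', 'Wales' ], [ 'John', 'Wall' ]), "\n";
```

Under that assumption, the pair above falls within the maximum_distance of 5 used in the Synopsis.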

SimilarNames

This class takes a list of Person objects and uses CompareTwoNames to find the pairs of people whose names are within a maximum edit distance of each other, set via maximum_distance.

The list of Person object pairs with similar names is available via the list_of_people_with_similar_names attribute. Alternatively, the list of name pairs themselves (without the Person objects) is available via the list_of_similar_name_pairs attribute.
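
The pairing step can be sketched in plain Perl, without the module's classes. This is only an illustration under stated assumptions: similar_name_pairs and levenshtein are hypothetical helpers, names are plain [first, last] array references rather than Person objects, and the per-pair distance is assumed to be the sum of the first-name and last-name distances.

```perl
use strict;
use warnings;
use List::Util qw(min);

# Two-row dynamic-programming Levenshtein distance (hypothetical helper).
sub levenshtein {
    my ($source, $target) = @_;
    my @source_chars = split //, $source;
    my @target_chars = split //, $target;
    my @previous = (0 .. scalar @target_chars);
    for my $i (1 .. @source_chars) {
        my @current = ($i);
        for my $j (1 .. @target_chars) {
            my $cost = $source_chars[$i - 1] eq $target_chars[$j - 1] ? 0 : 1;
            push @current, min(
                $previous[$j] + 1,            # deletion
                $current[$j - 1] + 1,         # insertion
                $previous[$j - 1] + $cost,    # substitution (or match)
            );
        }
        @previous = @current;
    }
    return $previous[-1];
}

# Hypothetical stand-in for SimilarNames: visit every unordered pair once and
# keep those whose ASSUMED summed distance is within $maximum_distance.
sub similar_name_pairs {
    my ($names, $maximum_distance) = @_;
    my @pairs;
    for my $i (0 .. $#{$names} - 1) {
        for my $j ($i + 1 .. $#{$names}) {
            my ($name_a, $name_b) = @{$names}[ $i, $j ];
            my $distance = levenshtein($name_a->[0], $name_b->[0])
                         + levenshtein($name_a->[1], $name_b->[1]);
            push @pairs, [ $name_a, $name_b ] if $distance <= $maximum_distance;
        }
    }
    return \@pairs;
}

my $pairs = similar_name_pairs(
    [
        [ 'John',     'Wayne' ],
        [ 'Sundance', 'Kid' ],
        [ 'Jose',     'Wales' ],
        [ 'John',     'Wall' ],
    ],
    5,
);
printf "%s %s <-> %s %s\n", @{ $_->[0] }, @{ $_->[1] } for @{$pairs};
```

With the Synopsis data this sketch keeps the same three pairs, none involving Sundance Kid, although the order and the orientation within each pair may differ from what the module returns.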

Accessors

list_of_similar_name_pairs

Called on a SimilarNames object, this returns the similar name pairs found among the Person objects passed in, as an ArrayRef[ArrayRef[ArrayRef[Str]]]. Similarity is measured with the Levenshtein edit distance, so the paired names are close to one another in spelling.

list_of_people_with_similar_names

This accessor is similar to list_of_similar_name_pairs, but returns a list of Person object pairs instead of the names.

Authors

Mateu X. Hunter <hunter@missoula.org>

Copyright

Copyright 2010, Mateu X. Hunter

License

You may distribute this code under the same terms as Perl itself.

Code Repository

http://github.com/mateu/Lingua-EN-SimilarNames-Levenshtein