
NAME

String::Compare - Compare two strings and return how much they are alike

SYNOPSIS

  use String::Compare;
  my $str1 = "J R Company";
  my $str2 = "J. R. Company";
  my $str3 = "J R Associates";
  my $points12 = compare($str1,$str2);
  my $points13 = compare($str1,$str3);
  if ($points12 > $points13) {
     print $str1." refers to ".$str2;
  } else {
     print $str1." refers to ".$str3;
  }

DESCRIPTION

This module was created when I needed to merge information from two databases and had to work out who was who in each one. The names were not always identical; sometimes there were small differences.

The problem was that I needed to choose the right person, so I had to measure how much the different names were alike. I tried comparing character by character, but situations like the one in the synopsis showed me that this wasn't enough. So I created a set of tests to give a more accurate score of how much the names are alike.

The result is a fraction between 0 and 1. If the strings are exactly equal, it returns 1; if they have nothing in common, it returns 0.

METHODS

compare($str1,$str2,%tests)

This method receives the two strings and, optionally, the names and weights of the tests to run. The default behavior is to use all the tests with weight 1. This method lowercases both strings, since case doesn't change the meaning of the content. The individual tests, however, are case sensitive, so if you call them directly you must lc the strings yourself.
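As a rough illustration of what a weight could mean here, the sketch below combines per-test scores with a weighted mean. This is an assumption: the documentation does not say how compare() combines the individual results, and weighted_score and the sample scores are hypothetical, not part of String::Compare.

```perl
use strict;
use warnings;

# Hypothetical helper, not part of String::Compare: combine per-test
# scores (each between 0 and 1) into one score using a weighted mean.
sub weighted_score {
    my ($scores, $weights) = @_;
    my ($sum, $total_weight) = (0, 0);
    for my $test (keys %$weights) {
        $sum          += $scores->{$test} * $weights->{$test};
        $total_weight += $weights->{$test};
    }
    return $total_weight ? $sum / $total_weight : 0;
}

# Made-up per-test scores for one pair of strings.
my %scores = (char_by_char => 0.5, word_by_word => 1.0);

# Equal weights, like the documented default of 1 for every test.
my $even = weighted_score(\%scores, { char_by_char => 1, word_by_word => 1 });

# Trusting word_by_word three times as much as char_by_char.
my $skewed = weighted_score(\%scores, { char_by_char => 1, word_by_word => 3 });

printf "%.3f %.3f\n", $even, $skewed;
```

Going by the signature above, the matching call to the module would presumably look like compare($str1, $str2, char_by_char => 1, word_by_word => 3).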

The current tests are listed below (you can also use the tests individually if you like).

P.S.: You can use custom tests. The tests are executed using eval, so if you want a custom test, just pass the full (package-qualified) name of the function.
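A minimal sketch of the idea, assuming a custom test is an ordinary function that takes two strings and returns a value between 0 and 1. My::Tests::same_length and run_named_test are hypothetical names used only for illustration; the module's own eval code may look different.

```perl
use strict;
use warnings;

package My::Tests;

# Hypothetical custom test: 1 if both strings have the same length,
# otherwise the ratio of the shorter length to the longer one.
sub same_length {
    my ($str1, $str2) = @_;
    my ($short, $long) = sort { $a <=> $b } (length($str1), length($str2));
    return $long ? $short / $long : 1;
}

package main;

# Call a test given only its fully qualified name, via string eval.
# The eval'd code sees the lexical $str1 and $str2 declared here.
sub run_named_test {
    my ($name, $str1, $str2) = @_;
    my $score = eval "$name(\$str1, \$str2)";
    die "test '$name' failed: $@" if $@;
    return $score;
}

my $score = run_named_test('My::Tests::same_length',
                           'J R Company', 'J. R. Company');
printf "%.3f\n", $score;
```

With the module itself, the name would presumably be passed as a key of %tests, for example 'My::Tests::same_length' => 1.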

P.S.2: If you create a test, please share it by sending it to me by email; I will be glad to include it in the default set.

char_by_char($str1,$str2)

Tests the strings character by character.
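One way to picture a character-by-character test is sketched below. It is illustrative only and not the module's code: positional_similarity is a hypothetical function that counts the positions at which both strings hold the same character and divides by the length of the longer string.

```perl
use strict;
use warnings;
use List::Util qw(max min);

# Illustrative only: share of positions where both strings hold the same
# character, relative to the longer string.
sub positional_similarity {
    my ($str1, $str2) = @_;
    my $longest  = max(length($str1), length($str2));
    my $shortest = min(length($str1), length($str2));
    return 1 if $longest == 0;    # two empty strings are identical
    my $matches = 0;
    for my $i (0 .. $shortest - 1) {
        $matches++ if substr($str1, $i, 1) eq substr($str2, $i, 1);
    }
    return $matches / $longest;
}

# The strings from the synopsis.
my $dotted = positional_similarity('J R Company', 'J. R. Company');
my $assoc  = positional_similarity('J R Company', 'J R Associates');
printf "%.3f %.3f\n", $dotted, $assoc;
```

Under this sketch the extra dots shift every later character, so "J. R. Company" scores lower than "J R Associates". That is the kind of situation that makes a single positional test insufficient.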

consonants($str1,$str2)

Runs char_by_char on the consonants only.

vowels($str1,$str2)

Runs char_by_char on the vowels only.

word_by_word($str1, $str2)

Runs char_by_char on each word, giving points according to the size of the word.
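A sketch of how such a word-level test could work, under stated assumptions: words are split on whitespace, paired by position, and each pair contributes points in proportion to the length of its longer word. Both word_similarity and positional_similarity are hypothetical names, and the module's actual scoring may differ.

```perl
use strict;
use warnings;
use List::Util qw(max min);

# Illustrative positional comparison (same idea as a char-by-char test).
sub positional_similarity {
    my ($str1, $str2) = @_;
    my $longest  = max(length($str1), length($str2));
    my $shortest = min(length($str1), length($str2));
    return 1 if $longest == 0;
    my $matches = 0;
    for my $i (0 .. $shortest - 1) {
        $matches++ if substr($str1, $i, 1) eq substr($str2, $i, 1);
    }
    return $matches / $longest;
}

# Compare word N of one string with word N of the other; longer words
# carry more weight. A word with no partner is compared with ''.
sub word_similarity {
    my ($str1, $str2) = @_;
    my @words1 = split ' ', $str1;
    my @words2 = split ' ', $str2;
    my $pairs  = max(scalar(@words1), scalar(@words2));
    my ($points, $possible) = (0, 0);
    for my $i (0 .. $pairs - 1) {
        my $w1   = $i < @words1 ? $words1[$i] : '';
        my $w2   = $i < @words2 ? $words2[$i] : '';
        my $size = max(length($w1), length($w2));
        $points   += $size * positional_similarity($w1, $w2);
        $possible += $size;
    }
    return $possible ? $points / $possible : 1;
}

my $dotted = word_similarity('J R Company', 'J. R. Company');
my $assoc  = word_similarity('J R Company', 'J R Associates');
printf "%.3f %.3f\n", $dotted, $assoc;
```

In this sketch the long matching word "Company" dominates the score, so "J. R. Company" ranks well above "J R Associates".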

chars_only($str1,$str2)

Runs char_by_char on only the characters matched by \w.
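The consonants, vowels and chars_only tests all follow the same pattern: filter each string, then compare what is left. A minimal sketch of that pattern follows. The names filtered_similarity, %filters and positional_similarity are hypothetical, and the exact character classes (ASCII letters, with y counted as a consonant) are assumptions.

```perl
use strict;
use warnings;
use List::Util qw(max min);

# Illustrative positional comparison (same idea as a char-by-char test).
sub positional_similarity {
    my ($str1, $str2) = @_;
    my $longest  = max(length($str1), length($str2));
    my $shortest = min(length($str1), length($str2));
    return 1 if $longest == 0;
    my $matches = 0;
    for my $i (0 .. $shortest - 1) {
        $matches++ if substr($str1, $i, 1) eq substr($str2, $i, 1);
    }
    return $matches / $longest;
}

# Each filter returns a copy of the string with unwanted characters removed.
my %filters = (
    consonants => sub { my $s = shift; $s =~ s/[^b-df-hj-np-tv-z]//gi; $s },
    vowels     => sub { my $s = shift; $s =~ s/[^aeiou]//gi;           $s },
    chars_only => sub { my $s = shift; $s =~ s/\W//g;                  $s },
);

sub filtered_similarity {
    my ($filter, $str1, $str2) = @_;
    my $keep = $filters{$filter} or die "unknown filter '$filter'";
    return positional_similarity($keep->($str1), $keep->($str2));
}

for my $filter (sort keys %filters) {
    printf "%-10s %.3f\n", $filter,
        filtered_similarity($filter, 'J R Company', 'J. R. Company');
}
```

Because punctuation and spaces are dropped before comparing, the two spellings of "J R Company" from the synopsis become identical under all three filters in this sketch.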

COPYRIGHT

This module was created by "Daniel Ruoso" <daniel@ruoso.com>. It is licensed under both the GNU GPL and the Artistic License.