NAME

text_compare.pl - simple command-line interface to Text::Similarity

SYNOPSIS

 text_compare.pl --type Text::Similarity::Overlaps GPL.txt GPL.txt

 text_compare.pl [[--verbose] [--stoplist=FILE] [--no-normalize] --type=TYPE | --help | --version] FILE1 FILE2

DESCRIPTION

This script is a simple command-line interface to the Text::Similarity set of Perl modules. By default it returns a normalized F-measure, a score between 0 and 1 that measures the similarity of the two files. It is computed as follows:

 precision = overlap_score / length_string_2
 recall = overlap_score / length_string_1
 F-measure = 2 * precision * recall / (precision + recall)
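
As an illustration only, the formulas above can be sketched in a few lines of Python. This is not the code used by Text::Similarity; the function name f_measure and the sample counts are hypothetical, and the sketch assumes that overlap_score and the two lengths are plain word counts. It also assumes that a zero overlap or an empty file yields a score of 0, which the formulas themselves leave undefined.

```python
def f_measure(overlap_score, length_1, length_2):
    """Return (precision, recall, F-measure) following the formulas above.

    Assumption: when the overlap is zero or either file is empty, all
    three values are reported as 0.0 to avoid dividing by zero.
    """
    if overlap_score == 0 or length_1 == 0 or length_2 == 0:
        return 0.0, 0.0, 0.0
    precision = overlap_score / length_2
    recall = overlap_score / length_1
    f = 2 * precision * recall / (precision + recall)
    return precision, recall, f


# Hypothetical counts: 3 overlapping words, 4 words in FILE1, 6 in FILE2.
precision, recall, f = f_measure(3, 4, 6)
print(f"precision={precision:.2f} recall={recall:.2f} F-measure={f:.2f}")
```

Comparing a file with itself, as in the first SYNOPSIS line, makes the overlap equal to both lengths under these assumptions, so precision, recall, and the F-measure all come out as 1.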

In addition, this program can report the cosine, E-measure, precision, and recall when run in verbose mode (--verbose).

OPTIONS

--type=TYPE

The type of text similarity measure. Valid values include:

    Text::Similarity::Overlaps

--stoplist=FILE

The name of a file containing stop words (one word per line).
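
To make the stop list format concrete, here is a minimal Python sketch of how a one-word-per-line file could be used to filter words before comparison. It is an illustration, not the module's implementation: the helper names load_stoplist and remove_stop_words are hypothetical, and lower-casing words before matching is an assumption rather than documented behavior.

```python
import io


def load_stoplist(handle):
    """Read stop words from a file-like object, one word per line."""
    # Assumption: blank lines are ignored and matching is case-insensitive.
    return {line.strip().lower() for line in handle if line.strip()}


def remove_stop_words(tokens, stop_words):
    """Return the tokens that do not appear in the stop list."""
    return [token for token in tokens if token.lower() not in stop_words]


# An in-memory stand-in for a stop list file with three entries.
stoplist_file = io.StringIO("the\nof\na\n")
stop_words = load_stoplist(stoplist_file)

tokens = "The name of a file".split()
filtered = remove_stop_words(tokens, stop_words)
print(filtered)
```

Under these assumptions, only the words that survive filtering would count toward the overlap and the file lengths.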

--no-normalize

Do not normalize scores. Normally, scores are normalized so that they range from 0 to 1. Using this option will give you a raw score instead.

--verbose

Be verbose. In this mode the program can also report the cosine, E-measure, precision, and recall.

--help

Show a detailed help message.

--version

Show version information.

AUTHORS

Ted Pedersen, University of Minnesota, Duluth tpederse at d.umn.edu

Jason Michelizzi, University of Minnesota, Duluth

Last modified by: $Id: text_compare.pl,v 1.8 2008/03/20 04:45:43 tpederse Exp $

BUGS

None known.

COPYRIGHT AND LICENSE

Copyright (C) 2004-2008, Jason Michelizzi and Ted Pedersen

This program is free software; you can redistribute it and/or modify it under the terms of the GNU General Public License as published by the Free Software Foundation; either version 2 of the License, or (at your option) any later version.

This program is distributed in the hope that it will be useful, but WITHOUT ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License for more details.

You should have received a copy of the GNU General Public License along with this program; if not, write to the Free Software Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA 02111-1307 USA