NAME

Benchmark::Featureset::StopwordLists - Compare various stopword list modules

Synopsis

        #!/usr/bin/env perl

        use Benchmark::Featureset::StopwordLists;

        Benchmark::Featureset::StopwordLists -> new -> run;

See scripts/stopwordlists.report.pl. This outputs HTML to STDOUT.

Hint: Redirect the output of that script to your $doc_root/stopwordlists.report.html.

A copy of the report ships in html/stopwordlists.report.html.

View this report on my website.

Description

Benchmark::Featureset::StopwordLists compares various stopword list modules.

The list of modules processed is shipped in data/module.list.ini, and can easily be edited before re-running:

        shell> scripts/copy.config.pl
        shell> scripts/stopwordlists.report.pl

The configuration file, and the role of scripts/copy.config.pl, are explained below under Installation.

Distributions

This module is available as a Unix-style distro (*.tgz).

See http://savage.net.au/Perl-modules/html/installing-a-module.html for help on unpacking and installing distros.

Installation

The Module Itself

Install Benchmark::Featureset::StopwordLists as you would for any Perl module:

Run:

        cpanm Benchmark::Featureset::StopwordLists

or run:

        sudo cpan Benchmark::Featureset::StopwordLists

or unpack the distro, and then either:

        perl Build.PL
        ./Build
        ./Build test
        sudo ./Build install

or:

        perl Makefile.PL
        make (or dmake or nmake)
        make test
        make install

The Configuration File

All that remains is to tell Benchmark::Featureset::StopwordLists your values for some options.

For that, see config/.htbenchmark.featureset.stopwordlists.conf.

If you are using Build.PL, running Build (without parameters) will run scripts/copy.config.pl, as explained next.

If you are using Makefile.PL, running make (without parameters) will also run scripts/copy.config.pl.

Either way, ensure you run scripts/copy.config.pl before editing the config file. It uses File::HomeDir to copy the config file to a directory where the run-time code in Benchmark::Featureset::StopwordLists will look for it.

        shell> cd Benchmark-Featureset-StopwordLists-1.00
        shell> perl scripts/copy.config.pl

Under Debian, this directory will be $HOME/.perl/Benchmark-Featureset-StopwordLists/. When you run copy.config.pl, it will report where it has copied the config file to.

Check the docs for File::HomeDir to see what your operating system returns for a call to my_dist_config().
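As a rough illustration of that lookup, the sketch below asks File::HomeDir for the config directory of this distro. The helper name dist_config_dir() is hypothetical and is not part of the distro. The sketch assumes the distro name passed to my_dist_config() is 'Benchmark-Featureset-StopwordLists', which matches the Debian path mentioned above, and it returns undef rather than dying if File::HomeDir is not installed.

```perl
#!/usr/bin/env perl

use strict;
use warnings;

# Hypothetical helper, not part of the distro: returns the per-user config
# directory for a distro name, or undef when File::HomeDir is not installed,
# the lookup fails, or the directory has not been created yet.
sub dist_config_dir
{
    my($dist_name) = @_;

    return undef if (! eval{require File::HomeDir; 1});

    # Without the {create => 1} option, my_dist_config() returns undef
    # for a directory which does not exist.
    return scalar eval{File::HomeDir -> my_dist_config($dist_name)};
}

# Assumption: this is the distro name used when the config file was copied.
my($dir) = dist_config_dir('Benchmark-Featureset-StopwordLists');

print defined($dir)
    ? "Config directory: $dir\n"
    : "No config directory found. Run scripts/copy.config.pl first.\n";
```

If the second message appears, the directory has probably not been created yet, and running scripts/copy.config.pl should create it.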

The point of this is that after the module is installed, the config file will be easily accessible and editable without needing permission to write to the directory structure in which modules are stored.

That's why File::HomeDir and Path::Class are pre-requisites for this module.

Although this is a good mechanism for modules which ship with their own config files, be advised that some CPAN tester machines run tests as users who don't have home directories, resulting in test failures.
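One possible way to soften that problem, sketched here as an assumption and not as what this distro's test suite actually does, is to skip the config-dependent tests when no usable home directory exists. The helper name home_dir_usable() is hypothetical, and the single ok() call is a placeholder for real tests.

```perl
#!/usr/bin/env perl

use strict;
use warnings;

use Test::More;

# Hypothetical helper: true if the given path is a usable home directory.
sub home_dir_usable
{
    my($home) = @_;

    return (defined($home) && length($home) && -d $home) ? 1 : 0;
}

SKIP:
{
    # Skip 1 test when the user running the tests has no home directory.
    skip 'No usable home directory, so no config file to test', 1
        if (! home_dir_usable($ENV{HOME}) );

    # Placeholder: tests which need the config file would go here.
    ok(-d $ENV{HOME}, 'Home directory exists, so config tests can run');
}

done_testing;
```

Using a SKIP block rather than a failing test means smoke-test machines without home directories report a skip instead of a failure.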

Constructor and Initialization

new() is called as my($builder) = Benchmark::Featureset::StopwordLists -> new(k1 => v1, k2 => v2, ...).

It returns a new object of type Benchmark::Featureset::StopwordLists.

Key-value pairs accepted in the parameter list (see corresponding methods for details):

o (None as yet)

Methods

new()

For use by subclasses.

run()

Does the real work.

See scripts/stopwordlists.report.pl and its output html/stopwordlists.report.html.

Hint: Redirect the output of that script to $doc_root/stopwordlists.report.html.

FAQ

Where is the HTML template for the report?

Templates ship in htdocs/assets/templates/benchmark/featureset/stopwordlists/.

See also htdocs/assets/css/benchmark/featureset/stopwordlists/.

How did you choose the modules to review?

By searching MetaCPAN.org for phrases like 'stopword' and 'stop word'.

See Also

The other modules in this series are Benchmark::Featureset::LocaleCountry and Benchmark::Featureset::SetOps.

One set of module comparison reviews, by Neil Bowers, is here.

And another set of module comparison reviews, by Ron Savage, is here.

Machine-Readable Change Log

The file Changes was converted into Changelog.ini by Module::Metadata::Changes.

Version Numbers

Version numbers < 1.00 represent development versions. From 1.00 up, they are production versions.

Support

Email the author, or log a bug on RT:

https://rt.cpan.org/Public/Dist/Display.html?Name=Benchmark::Featureset::StopwordLists.

Repository

https://github.com/ronsavage/Benchmark-Featureset-StopwordLists.git.

Author

Benchmark::Featureset::StopwordLists was written by Ron Savage <ron@savage.net.au> in 2012.

Home page: http://savage.net.au/index.html.

Copyright

Australian copyright (c) 2012, Ron Savage.

        All Programs of mine are 'OSI Certified Open Source Software';
        you can redistribute them and/or modify them under the terms of
        The Artistic License, a copy of which is available at:
        http://www.opensource.org/licenses/index.html