NAME

Text::DocumentCollection - a collection of documents

SYNOPSIS

DESCRIPTION

CLASS METHODS

new

The constructor. Arguments must be passed as key/value pairs; the file key is mandatory.

my $c = Text::DocumentCollection->new( file => 'coll.db' );

Documents in the collection are saved in the specified file, which is currently handled by a DB_File hash.

INSTANCE METHODS

Add

Add a document to the collection, tagging it with a unique key.

$c->Add( $key, $doc );

Add dies if the key is already present.

To change an existing key, use Delete and then Add.

Delete

Discard a document from the collection.

NewFromDB

Loads the collection from the given DB file:

my $c = Text::DocumentCollection->NewFromDB( file => 'coll.db' );

The file must be either empty or created by an earlier invocation of new or NewFromDB, followed by any number of Add and/or Delete calls.

Currently, all documents in the collection are revived (by calling NewFromString). This poses performance problems for huge collections; a caching mechanism would be an option in this case.

IDF

Inverse document frequency of a given term.

The definition used is as follows. Given a term t, a set of documents DOC, and the binary relation has-term:

IDF(t) = log2( #DOC / #{ d in DOC | d has-term t } )

The logarithm is in base 2, since this is related to an information measurement, and # is the cardinality operator.
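The formula above can be tried out on a toy corpus. The following is a minimal sketch in plain Perl, independent of how Text::DocumentCollection computes the value internally; the %docs data and the helper names log2 and idf are hypothetical and exist only for illustration.

```perl
use strict;
use warnings;

# Hypothetical toy corpus: document key => list of terms it contains.
my %docs = (
    d1 => [qw(perl text document)],
    d2 => [qw(perl hash)],
    d3 => [qw(text collection)],
    d4 => [qw(perl collection)],
);

# Perl's built-in log() is the natural logarithm, so convert to base 2.
sub log2 { return log( $_[0] ) / log(2) }

# IDF(t) = log2( #DOC / #{ d in DOC | d has-term t } )
# Returns undef when no document has the term, to avoid dividing by zero
# (an assumption of this sketch, not documented module behaviour).
sub idf {
    my ( $term, $docs ) = @_;
    my $total = keys %$docs;    # #DOC
    my $with  = grep {          # number of documents having the term
        my $terms = $_;
        grep { $_ eq $term } @$terms
    } values %$docs;
    return undef unless $with;
    return log2( $total / $with );
}

printf "%-10s %.3f\n", $_, idf( $_, \%docs ) for qw(perl text hash);
```

In this corpus, "perl" appears in 3 of 4 documents, giving log2(4/3), about 0.415; "text" appears in 2 of 4, giving 1; and "hash" appears in 1 of 4, giving 2. Rarer terms get higher values.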

EnumerateV

Enumerates all the documents in the collection. Called as:

my @result = $c->EnumerateV( \&Callback, 'the rock' );

The function Callback will be called on each element of the collection as:

my @l = Callback( $c, $key, $doc, $rock );

where $rock is the second argument to EnumerateV.

Since $c is the first argument, the callback may be an instance method of Text::DocumentCollection.

The final result is obtained by concatenating all the partial results (@l in the example above). If you do not want a result, simply return the empty list ().

There is no particular order of enumeration, so there is no particular order in which results are concatenated.
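The callback contract described above can be illustrated without the module itself. The sketch below uses a plain hash as a stand-in for the collection and a hypothetical enumerate_v function that mimics the documented calling convention; it is not the module's implementation, and the names enumerate_v, keys_matching and %collection are assumptions made for the example.

```perl
use strict;
use warnings;

# Hypothetical stand-in for a collection: key => document text.
my %collection = (
    a => 'the quick brown fox',
    b => 'lazy dog',
    c => 'fox and dog',
);

# Stand-in enumerator following the documented contract: call the callback
# as ( $c, $key, $doc, $rock ) for each element and concatenate the
# lists it returns.
sub enumerate_v {
    my ( $c, $callback, $rock ) = @_;
    my @result;
    for my $key ( keys %$c ) {    # hash order, so no particular order
        push @result, $callback->( $c, $key, $c->{$key}, $rock );
    }
    return @result;
}

# Callback: return the key if the document contains the word in $rock,
# otherwise the empty list, which contributes nothing to the result.
sub keys_matching {
    my ( $c, $key, $doc, $rock ) = @_;
    return $doc =~ /\b\Q$rock\E\b/ ? ($key) : ();
}

my @hits = enumerate_v( \%collection, \&keys_matching, 'fox' );

# Sort before printing, since the enumeration order is unspecified.
print join( ', ', sort @hits ), "\n";
```

Because the order of enumeration is not defined, any caller that needs a stable order should sort the concatenated result, as done here.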

AUTHORS

spinellia@acm.org
walter@humans.net
