Hugo W.L. ter Doest

NAME

Statistics::Candidates - Perl5 module for manipulating candidate features (helper module for Statistics::MaxEntropy)

SYNOPSIS

  use Statistics::Candidates;

  # create a new candidates object and read candidate features
  $candidates = Statistics::Candidates->new($some_file);

  # checks for constant candidate features
  $candidates->check();

  # writes candidates that were not added to a file
  $candidates->write($some_other_file);

  # clear the administration about being added or not ...
  $candidates->clear();

DESCRIPTION

The Candidates object is for storage, retrieval, and manipulation of candidate features.

This code is kept separate from MaxEntropy.pm because a set of candidate features is best treated as an object in its own right; blessing the candidates into MaxEntropy would be unnatural.

In addition, this code is simpler and more stable than the code in the MaxEntropy module.

This module requires Bit::SparseVector.

METHODS

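new
 $candidates = Statistics::Candidates->new($file);

 Creates a new candidates object and reads the candidate features from $file.

check
 $candidates->check();

 Checks for constant candidate features.

write
 $candidates->write($file);

 Writes the candidates that were not added to $file.

clear
 $candidates->clear();

 Clears the administration of which candidates have been added.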

FILE SYNTAX

The syntax of the candidate feature file is more or less the same as that for the events file:

  • each line is an event (events specified in the same order as the events file);

  • each column is a feature;

  • constant feature functions are forbidden;

  • values are 0 or 1;

  • no space between features;

  • lines that start with # are ignored.

Below is the layout of a set of candidates for m events and c candidate features; each f_ij is a single bit:

    name_c <tab> name_c-1 ... name_2 <tab> name_1 <newline>
    f_1c ... f_13 f_12 f_11 <newline>
             .
             .
             .
    f_ic ... f_i3 f_i2 f_i1 <newline>
             .
             .
             .
    f_mc ... f_m3 f_m2 f_m1
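
The layout above can be illustrated with a small, self-contained sketch. It does not use Statistics::Candidates and does not claim to reproduce what new() or check() do internally; the helper names parse_candidates and constant_features are hypothetical, and skipping blank lines is an assumption that the rules above do not state. The sketch reads the tab-separated name line, reads one bit string per event, and reports the features whose value is the same for every event.

```perl
use strict;
use warnings;

# Hypothetical helper (not part of Statistics::Candidates): parse candidate
# feature text into a list of feature names and a list of bit rows.
sub parse_candidates {
    my ($text) = @_;
    my (@names, @rows);
    for my $line (split /\n/, $text) {
        next if $line =~ /^#/;       # lines that start with # are ignored
        next if $line =~ /^\s*$/;    # assumption: blank lines are skipped too
        if (!@names) {
            @names = split /\t/, $line;    # first data line: tab-separated names
            next;
        }
        die "bad bit string: $line\n" unless $line =~ /^[01]+$/;
        die "expected " . @names . " bits: $line\n"
            unless length($line) == @names;
        push @rows, [ split //, $line ];   # one bit per feature, no separators
    }
    return (\@names, \@rows);
}

# Hypothetical helper: return the names of features that take the same
# value in every event (constant feature functions are forbidden).
sub constant_features {
    my ($names, $rows) = @_;
    my @constant;
    for my $j (0 .. $#$names) {
        my %seen;
        $seen{ $_->[$j] } = 1 for @$rows;
        push @constant, $names->[$j] if keys(%seen) < 2;
    }
    return @constant;
}

# Three candidate features (columns) and four events (lines).
my $text = join "\n",
    "# three candidate features, four events",
    "f3\tf2\tf1",
    "101",
    "001",
    "111",
    "011";

my ($names, $rows) = parse_candidates($text);
my @constant = constant_features($names, $rows);

print "features: @$names\n";              # features: f3 f2 f1
print "events: ", scalar(@$rows), "\n";   # events: 4
print "constant: @constant\n";            # constant: f1
```

In this sample the last column (f1) is 1 for every event, so it is a constant feature of the kind the syntax rules forbid.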

SEE ALSO

Statistics::MaxEntropy, Statistics::SparseVector.

VERSION

Version 0.1

AUTHOR

Hugo W.L. ter Doest

COPYRIGHT

Statistics::Candidates comes with ABSOLUTELY NO WARRANTY and may be copied only under the terms of the GNU Library General Public License (version 2, or later), which may be found in the distribution.