NAME

Treex::Tool::Tagger::Featurama - base class for Featurama PoS taggers

VERSION

version 2.20151102

DESCRIPTION

Perl wrapper for the Featurama implementation of Collins' perceptron algorithm. This class cannot be instantiated directly; use a derived class that overrides the methods _get_features(), _get_feature_names(), and probably also _analyze().

SYNOPSIS

 use Treex::Tool::Tagger::Featurama::SomeDerivedClass;

 my @wordforms = qw(John loves Jack);

 my $tagger = Treex::Tool::Tagger::Featurama::SomeDerivedClass->new(path => '/path/to/model');

 my ($tags_rf, $lemmas_rf) = $tagger->tag_sentence(\@wordforms);

CONSTRUCTOR

my $tagger = Treex::Tool::Tagger::Featurama->new(path => '/path/to/model');

METHODS

my ($tags_rf) = $tagger->tag_sentence(\@wordforms);

METHODS TO OVERRIDE

_analyze($wordform)

This method should provide all possible morphological analyses for the given wordform.

_get_feature_names()

This method should return an array of feature names.

_get_features($wordforms_rf, $analyses_rf_rf, $index)

This method should return an array of features, given all wordforms in the sentence, all possible morphological analyses for each of the wordforms, and a position in the sentence. Since the features may include parts of the context, it is necessary to provide the whole sentence to this function. For example:

 $featurama->_get_features(
     [qw(Time flies)],
     [[qw(NN NNP VB JJ)], [qw(VBZ NNS)]],
     0
 );

_extract_tag_and_lemma($index, $wordform)

This method should extract the tag and lemma, given an index in the sentence and the wordform at that position. It will probably want to use $self->perc. TODO: this will probably change.
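
To make the _get_feature_names() and _get_features() contracts described above more concrete, here is a minimal standalone sketch. It is not taken from any existing derived class: the feature set (current form, neighbouring forms, joined candidate tags), the boundary markers '<s>' and '</s>', and the plain function names get_feature_names and get_features are all assumptions chosen for illustration. In a real derived class these would be methods that receive $self as their first argument.

```perl
use strict;
use warnings;

# Hypothetical feature names; a real tagger defines its own set.
sub get_feature_names {
    return qw(Form PrevForm NextForm PossibleTags);
}

# Returns one feature value per name above, for the word at $index.
# $wordforms_rf   - reference to an array of all wordforms in the sentence
# $analyses_rf_rf - reference to an array of array references, one list of
#                   candidate tags per wordform
sub get_features {
    my ($wordforms_rf, $analyses_rf_rf, $index) = @_;

    my $last = $#{$wordforms_rf};
    my $form = $wordforms_rf->[$index];

    # Context features use assumed boundary markers at the sentence edges.
    my $prev = $index > 0     ? $wordforms_rf->[ $index - 1 ] : '<s>';
    my $next = $index < $last ? $wordforms_rf->[ $index + 1 ] : '</s>';

    # Collapse the candidate tags for this word into a single feature value.
    my $tags = join '|', @{ $analyses_rf_rf->[$index] };

    return ($form, $prev, $next, $tags);
}

my @names    = get_feature_names();
my @features = get_features(
    [qw(Time flies)],
    [[qw(NN NNP VB JJ)], [qw(VBZ NNS)]],
    0,
);

# Print each feature name next to its value.
print "$names[$_]=$features[$_]\n" for 0 .. $#names;
```

The sketch needs the whole sentence because the PrevForm and NextForm features look at neighbouring positions, which is why _get_features() takes all wordforms rather than a single one.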

SEE ALSO

Treex::Tool::Tagger::Featurama::EN, Treex::Tool::Tagger::Featurama::CS

AUTHORS

Tomáš Kraut <kraut@ufal.mff.cuni.cz>

Ondřej Dušek <odusek@ufal.mff.cuni.cz>

COPYRIGHT AND LICENSE

Copyright © 2011-2012 by Institute of Formal and Applied Linguistics, Charles University in Prague

This module is free software; you can redistribute it and/or modify it under the same terms as Perl itself.