Lingua::FreeLing3::Tokenizer - Interface to FreeLing3 Tokenizer
    use Lingua::FreeLing3::Tokenizer;

    my $pt_tok = Lingua::FreeLing3::Tokenizer->new("pt");

    # compute list of Lingua::FreeLing3::Word
    my $list_of_words = $pt_tok->tokenize("texto e mais texto.");

    # compute list of strings (words)
    my $list_of_strings = $pt_tok->tokenize("texto e mais texto.", to_text => 1);
Interface to the FreeLing3 tokenizer library.
new
Object constructor. One argument is required: the language code (Lingua::FreeLing3 will search for the tokenization data file).
Returns the tokenizer object for that language, or undef in case of failure.
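Because the constructor returns undef on failure, callers should check the result before calling any method on it. The following is a minimal sketch, not part of the module: try_tokenizer is a hypothetical helper name, and wrapping the call in eval with a run-time require is an assumption added here so that the script also runs where Lingua::FreeLing3 is not installed.

```perl
use strict;
use warnings;

# Hypothetical helper (not provided by Lingua::FreeLing3): returns a
# tokenizer object, or undef if the module cannot be loaded or if
# new() fails for the given language code.
sub try_tokenizer {
    my ($lang) = @_;
    my $tok = eval {
        require Lingua::FreeLing3::Tokenizer;
        Lingua::FreeLing3::Tokenizer->new($lang);
    };
    return $tok;    # undef on any failure
}

my $pt_tok = try_tokenizer("pt");
if (defined $pt_tok) {
    print "Tokenizer for 'pt' is ready\n";
}
else {
    print "No tokenizer for 'pt' (module or language data missing)\n";
}
```

In a script where a missing tokenizer is fatal, the else branch could simply die with the same message.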
tokenize
This is the only available method for the tokenizer object. It receives a string and tokenizes the text, returning a reference to a list of words.
By default it returns a reference to a list of Lingua::FreeLing3::Word objects. If the to_text option is set to a true value, it returns a reference to a list of strings instead.
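The to_text option described above can be illustrated with a short sketch. The helper name print_tokens is hypothetical and not part of the module; it only relies on the tokenize call shown in the synopsis. The tokenizer is created inside eval, an assumption made here so that the script exits quietly when the module or the "pt" data file is unavailable.

```perl
use strict;
use warnings;

# Hypothetical helper: tokenize $text with to_text => 1, so tokenize()
# returns plain strings, print one token per line, and return the
# number of tokens found.
sub print_tokens {
    my ($tokenizer, $text) = @_;
    my $strings = $tokenizer->tokenize($text, to_text => 1);
    print "$_\n" for @$strings;
    return scalar @$strings;
}

# Build the tokenizer; $pt_tok stays undef if loading or new() fails.
my $pt_tok = eval {
    require Lingua::FreeLing3::Tokenizer;
    Lingua::FreeLing3::Tokenizer->new("pt");
};

if (defined $pt_tok) {
    my $count = print_tokens($pt_tok, "texto e mais texto.");
    print "$count tokens\n";
}
```

Dropping the to_text argument from the tokenize call would give a reference to a list of Lingua::FreeLing3::Word objects instead, which is the form to keep when the tokens are to be passed on as objects rather than printed.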
See Lingua::FreeLing3(3) for the documentation table of contents, the FreeLing library documentation for further information, or perl(1) itself.
Alberto Manuel Brandão Simões, <ambs@cpan.org>
Jorge Cunha Mendes <jorgecunhamendes@gmail.com>
Copyright (C) 2011 by Projecto Natura