Lingua::Interset::Converter - Implements a converter between two physical tagsets via Interset.
version 2.043
    use Lingua::Interset::Converter;
    my $c = Lingua::Interset::Converter->new ('from' => 'cs::multext', 'to' => 'cs::pdt');
    while (<CONLL_IN>)
    {
        chomp ();
        my @fields = split (/\t/, $_);
        my $source_tag = $fields[4];
        $fields[4] = $c->convert ($source_tag);
        print (join("\t", @fields), "\n");
    }
Converter is a simple class that implements Interset-based conversion of tags between two physical tagsets. It caches converted tags, which improves performance on a large corpus, where the same source tags recur many times.
from: Identifier of the source tagset (composed of language code and tagset id, all lowercase, for example cs::multext). It must be provided upon construction.
to: Identifier of the target tagset (composed of language code and tagset id, all lowercase, for example cs::pdt). It must be provided upon construction.
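The identifier format (language code, `::`, tagset id, all lowercase) can be illustrated with a small check. This is only a sketch: `split_tagset_id` is a hypothetical helper, not part of Lingua::Interset, and the regular expression is an assumption about which characters an identifier may contain.

```perl
use strict;
use warnings;

# Hypothetical helper, not part of Lingua::Interset.
# Splits an identifier such as 'cs::multext' into language code and tagset id.
sub split_tagset_id
{
    my $id = shift;
    # Assumed pattern: lowercase language code, '::', then a tagset id made of
    # lowercase letters, digits or underscores.
    return ($1, $2) if $id =~ m/^([a-z]+)::([a-z0-9_]+)$/;
    # Empty list (undef in scalar context) when the identifier does not match.
    return;
}

for my $id ('cs::multext', 'cs::pdt', 'CS::PDT')
{
    my ($lang, $tagset) = split_tagset_id ($id);
    if (defined ($lang))
    {
        print ("${id}: language=$lang, tagset=$tagset\n");
    }
    else
    {
        print ("${id}: not in the expected lowercase form\n");
    }
}
```

The uppercase variant is rejected here only because of the assumed pattern; how the module itself reacts to such an identifier is not stated in this documentation.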
    my $tag1 = $c->convert ($tag0);
Converts a tag from the source tagset to the target tagset via Interset. Converted tags are cached, so the (potentially costly) Interset decoding and encoding methods are called only once per distinct source tag.
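The caching idea can be sketched independently of the module. In the example below, `slow_convert` is a hypothetical stand-in for the costly decode-then-encode step (it is not the Interset API), and `cached_convert` is an assumed wrapper that shows one common way to remember results per source tag. It is not necessarily how Converter implements its cache.

```perl
use strict;
use warnings;

# Hypothetical stand-in for the costly decoding and encoding step.
# It only counts its calls and returns a marked copy of the tag.
my $calls = 0;
sub slow_convert
{
    my $tag = shift;
    $calls++;
    return 'converted:' . $tag;
}

# Cache keyed by source tag: each distinct tag is converted only once.
my %cache;
sub cached_convert
{
    my $tag = shift;
    $cache{$tag} = slow_convert ($tag) unless exists ($cache{$tag});
    return $cache{$tag};
}

# Five lookups, but only two distinct source tags.
my @results = map { cached_convert ($_) } qw(NN VB NN NN VB);
print ("$_\n") for @results;
print ("calls to slow_convert: $calls\n");
```

With five lookups over two distinct tags, the stand-in function runs twice; the remaining three lookups are served from the hash.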
Lingua::Interset
Dan Zeman <zeman@ufal.mff.cuni.cz>
This software is copyright (c) 2014 by Univerzita Karlova v Praze (Charles University in Prague).
This is free software; you can redistribute it and/or modify it under the same terms as the Perl 5 programming language system itself.
To install Lingua::Interset, copy and paste the appropriate command into your terminal.
cpanm
cpanm Lingua::Interset
CPAN shell
perl -MCPAN -e shell
install Lingua::Interset
For more information on module installation, please visit the detailed CPAN module installation guide.