nat-create - Command line tool to create NATools Corpora Objects
nat-create <file1.nat> <file2.nat>
nat-create -tmx <file.tmx>
This is the basic command used to create a NATools Corpora Object from the command line.
A NATools Corpora Object is a directory containing:
the configuration file ("nat.cnf" - metadata information)
the corpus
the corpus indexes
the probabilistic translation dictionaries ("source-target.dmp", "target-source.dmp")
the (bi,tri,tetra)grams databases ("source.ngrams", "target.ngrams")
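The layout above can be checked with a few lines of shell. This is only a sketch: check_nat_object is a hypothetical helper, not part of NATools. It looks only for the file names stated in the list, because the names of the corpus and index files are not given here. The n-gram databases are treated as optional, on the assumption that they are only built when -ngrams is given.

```shell
# Hypothetical helper (not part of NATools): report which of the files
# named in this document are missing from a NATools Corpora Object directory.
check_nat_object() {
    dir=$1
    missing=0
    # Files this document names explicitly.
    for f in nat.cnf source-target.dmp target-source.dmp; do
        if [ ! -f "$dir/$f" ]; then
            echo "missing: $f"
            missing=1
        fi
    done
    # Assumption: the n-gram databases exist only when -ngrams was used,
    # so their absence is reported but does not count as a failure.
    for f in source.ngrams target.ngrams; do
        [ -f "$dir/$f" ] || echo "optional, not present: $f"
    done
    return "$missing"
}
```

Calling check_nat_object with the path of the directory returns a non-zero status if any of the three required files is absent.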
The -tokenize flag can be used to force NATools to tokenize the texts. Note that at the moment a Portuguese tokenizer is used for all languages. This might change in the future.
-tokenize
The -id=name flag can be used to set the name of the NATools Corpora Object. By default, the name is read interactively.
-id=name
The -q flag can be used to force quiet mode. In this case, the name is extracted from the file names.
-q
The -lang=PT..EN flag can be used to force the corpus languages, given as a pair separated by "..".
-lang=PT..EN
The -ngrams flag can be set to force NATools to create ngrams indexes.
-ngrams
The -noEM flag is used to bypass the EM-Algorithm (useful for debug purposes, mainly).
-noEM
The -ipfp flag is mutually exclusive with -noEM, -samplea and -sampleb. It selects IPFP as the EM-Algorithm to be used. An optional numeric argument sets the number of iterations, which defaults to 5.
-ipfp
The -samplea flag is mutually exclusive with -noEM, -ipfp and -sampleb. It selects Sample A as the EM-Algorithm to be used. An optional numeric argument sets the number of iterations, which defaults to 10.
-samplea
The -sampleb flag is mutually exclusive with -noEM, -ipfp and -samplea. It selects Sample B as the EM-Algorithm to be used. An optional numeric argument sets the number of iterations, which defaults to 10.
-sampleb
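Since -noEM, -ipfp, -samplea and -sampleb exclude one another, a wrapper script may want to check its arguments before calling nat-create. The following is a minimal sketch of such a check, not something NATools provides: count_em_flags is a hypothetical helper, the file names are placeholders, and placing the flags before the input files is an assumption based on the synopsis. The trailing * in the patterns tolerates an attached iteration count, since the exact syntax of that optional argument is not described here.

```shell
# Hypothetical pre-flight check (not part of NATools): print how many of
# the mutually exclusive EM flags appear among the given arguments.
count_em_flags() {
    n=0
    for arg in "$@"; do
        case $arg in
            -noEM|-ipfp*|-samplea*|-sampleb*) n=$((n + 1)) ;;
        esac
    done
    echo "$n"
}

# Placeholder arguments: quiet mode, forced languages, IPFP, two input files.
set -- -q -lang=PT..EN -ipfp corpus.pt.nat corpus.en.nat

if [ "$(count_em_flags "$@")" -gt 1 ]; then
    echo "choose only one of -noEM, -ipfp, -samplea, -sampleb" >&2
else
    # Dry run: prints the command line instead of executing it.
    echo nat-create "$@"
fi
```

Removing the echo in the last branch turns the dry run into a real invocation.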
NATools documentation, perl(1)
Alberto Manuel Brandão Simões, <ambs@cpan.org>
Copyright (C) 2006-2011 by Alberto Manuel Brandão Simões