This class knows how to read two treebank formats, the Penn format and the Chomsky Normal Form (CNF) format. These formats differ in how they handle terminal nodes. The Penn format places pre-terminal part of speech tags in the left-hand position of ...
KAHN/Lingua-Treebank-0.16 - 28 Aug 2008 20:08:52 GMT
Factory class of modules for reading treebanks in different formats. The default format is the Penn Treebank format. Other supported formats are the format produced by the Berkeley parser, the Stanford parser (including typed dependencies), TigerXML ...
TIEDEMANN/Lingua-Align-0.04 - 10 Dec 2012 18:31:24 GMT
"Text::StemTagPOS" uses the modules Lingua::Stem::Snowball and Lingua::EN::Tagger to do part-of-speech tagging and stemming of English text. It was developed to pre-process text for other modules. Encoding of all text should be in Perl's internal for...
KUBINA/Text-StemTagPOS-0.61 - 31 Dec 2011 13:41:21 GMT
DZ Interset is a universal framework for reading, writing, converting and interpreting part-of-speech and morphosyntactic tags from multiple tagsets of many different natural languages. Individual tagsets are mapped to the Interset using specialized ...
ZEMAN/Lingua-Interset-3.014 - 31 Jan 2019 13:50:27 GMT
- Lingua::Interset::Atom - Atomic driver for a surface feature.
- Lingua::Interset::Tagset::HI::Conll - Driver for the Hindi tagset of the shared tasks at ICON 2009, ICON 2010 and COLING 2012, as used in the CoNLL data format.
- Lingua::Interset::Tagset::AR::Padt - Driver for the PADT 2.0 / ElixirFM Arabic positional tagset.
Each node in the analytical tree is tagged using "Lingua::EN::Tagger" (Penn Treebank POS tags). Because Lingua::EN::Tagger does its own tokenization, it checks whether the tokenization is the same....
VARISD/Treex-EN-2.20151102 - 02 Nov 2015 20:29:13 GMT