Lingua::Align::Corpus::Treebank - Factory class for reading treebanks

Factory class of modules for reading treebanks in different formats. The default format is the Penn Treebank format. Other supported formats are the format produced by the Berkeley parser, the Stanford parser (including typed dependencies), TigerXML ...

TIEDEMANN/Lingua-Align-0.04 - 10 Dec 2012 18:31:24 GMT

Lingua::Align::Corpus::Treebank::Penn - Read the Penn Treebank format

TIEDEMANN/Lingua-Align-0.04 - 10 Dec 2012 18:31:24 GMT
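
For readers unfamiliar with the Penn Treebank bracketed format named above, the following is a minimal sketch of reading one tree. It is written in Python, is independent of Lingua::Align, and does not reproduce the module's API or its internal tree structure; the names `parse_penn` and `leaves` are hypothetical, and the input is assumed to be a single well-formed tree.

```python
import re

def parse_penn(text):
    """Parse one bracketed tree into nested (label, children) tuples.

    Assumes well-formed input such as "(S (NP (DT the) (NN cat)) ...)".
    Leaves are plain strings; an unlabeled wrapper "( (S ...))" gets label "".
    """
    tokens = re.findall(r"\(|\)|[^\s()]+", text)
    pos = 0

    def parse_node():
        nonlocal pos
        pos += 1                      # skip the opening "("
        if tokens[pos] == "(":        # unlabeled wrapper node
            label = ""
        else:
            label = tokens[pos]
            pos += 1
        children = []
        while tokens[pos] != ")":
            if tokens[pos] == "(":
                children.append(parse_node())
            else:
                children.append(tokens[pos])
                pos += 1
        pos += 1                      # skip the closing ")"
        return (label, children)

    return parse_node()

def leaves(node):
    """Collect the terminal words of a parsed tree, left to right."""
    _, children = node
    words = []
    for child in children:
        if isinstance(child, tuple):
            words.extend(leaves(child))
        else:
            words.append(child)
    return words

tree = parse_penn("(S (NP (DT the) (NN cat)) (VP (VBD slept)))")
words = leaves(tree)
```

Here `tree[0]` is the root label `S` and `words` holds the terminals in sentence order.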

Lingua::Align::Corpus::Treebank::Stanford - Read output from the Stanford parser

Module to read treebanks in Penn Treebank format including dependency relations produced by the Stanford parser. Note: Adding dependency relations to the phrase-structure trees is still a bit buggy.

TIEDEMANN/Lingua-Align-0.04 - 10 Dec 2012 18:31:24 GMT

Lingua::Align::Corpus::Treebank::TigerXML - Read the TigerXML format

TIEDEMANN/Lingua-Align-0.04 - 10 Dec 2012 18:31:24 GMT

Lingua::Align::Corpus::Treebank::Berkeley - Read the output of the Berkeley parser

TIEDEMANN/Lingua-Align-0.04 - 10 Dec 2012 18:31:24 GMT

Lingua::Align::Corpus::Treebank::AlpinoXML - Read Alpino XML

TIEDEMANN/Lingua-Align-0.04 - 10 Dec 2012 18:31:24 GMT

convert_treebank - convert a treebank from one format to another

This script allows you to convert a treebank to another format. The converted treebank is printed to STDOUT. Currently the following formats are supported: AlpinoXML (alpino) The XML format used by the Dutch dependency parser Alpino. Use the option [...

TIEDEMANN/Lingua-Align-0.04 - 10 Dec 2012 18:31:24 GMT

treebank2moses - convert treebanks to Moses/GIZA++ format (plain text)

TIEDEMANN/Lingua-Align-0.04 - 10 Dec 2012 18:31:24 GMT

doc::index

TIEDEMANN/Lingua-Align-0.04 - 10 Dec 2012 18:31:24 GMT

Lingua::Align - Perl modules for the alignment of parallel corpora

Lingua::Align contains modules for automatic tree alignment based on discriminative classification and alignment inference. More details about the tree aligner can be found in Lingua::Align::Trees. The following gives a general overview and motivatio...

TIEDEMANN/Lingua-Align-0.04 - 10 Dec 2012 18:31:24 GMT

Lingua::Align::Corpus - reading corpus data

Read corpus data in various formats. The default format is plain text with one sentence per line. For other types (parsed corpora, etc.), use the "-type" flag....

TIEDEMANN/Lingua-Align-0.04 - 10 Dec 2012 18:31:24 GMT

Lingua::Align::Features - Feature extraction for tree alignment

Extract features from a pair of nodes from two given syntactic trees (source and target language). The trees should be complex hash structures as produced by Lingua::Align::Corpus::Treebank::TigerXML. The returned features are given as simple key-val...

TIEDEMANN/Lingua-Align-0.04 - 10 Dec 2012 18:31:24 GMT

coocfreq - count co-occurrence frequencies for arbitrary features of nodes in a parallel treebank

This script counts frequencies and co-occurrence frequencies of source and target language features. It runs through the sentence aligned treebank and combines all node pairs. Note that co-occurrence frequencies in a sentence are " max( srcfreq(srcfe...

TIEDEMANN/Lingua-Align-0.04 - 10 Dec 2012 18:31:24 GMT

sta2moses - convert from Stockholm Tree Aligner format to Moses/GIZA++ (plain text)

This script reads through a parallel treebank using the tree alignment file (alignments.xml) and produces sentence aligned plain text files (to be used with Moses/Giza++). The corpus will be stored in alignments.src and alignments.trg....

TIEDEMANN/Lingua-Align-0.04 - 10 Dec 2012 18:31:24 GMT

treealign - training tree alignment classifiers and aligning syntactic trees

This script allows you to train a tree alignment model and to apply it to parallel treebanks. Tree alignment is based on local binary classification and rich feature sets. Currently, training data has to be in Stockholm Tree Aligner format. The out...

TIEDEMANN/Lingua-Align-0.04 - 10 Dec 2012 18:31:24 GMT

convert_bitext - a script for converting bitexts

Convert bitexts from one format to another. There are several formats supported by Lingua::Align. Check Lingua::Align::Corpus, Lingua::Align::Corpus::Treebank for more information....

TIEDEMANN/Lingua-Align-0.04 - 10 Dec 2012 18:31:24 GMT

treealigneval - a script for computing precision and recall scores for tree alignment

Both gold-standard-file and tree-alignment-file should be in Stockholm Tree Aligner Format. Here is an example: <?xml version="1.0" ?> <treealign> <head> <alignment-metadata> <date>Tue May 4 16:23:04 2010</date> <author>Lingua-Align</author> </alignm...

TIEDEMANN/Lingua-Align-0.04 - 10 Dec 2012 18:31:24 GMT
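
As a rough illustration of the precision and recall scores mentioned above, the sketch below compares a set of predicted node links against a gold standard. It is written in Python and is not the treealigneval implementation: the function name `precision_recall` and the node identifiers are hypothetical, links are assumed to be already extracted as (source node, target node) pairs, and no distinction between link types is made.

```python
def precision_recall(gold_links, predicted_links):
    """Return (precision, recall) for predicted links against gold links.

    Each link is a (source_node_id, target_node_id) pair. An empty
    prediction or an empty gold standard yields 0.0 for the affected score.
    """
    gold = set(gold_links)
    predicted = set(predicted_links)
    correct = len(gold & predicted)   # links present in both sets
    precision = correct / len(predicted) if predicted else 0.0
    recall = correct / len(gold) if gold else 0.0
    return precision, recall

# Hypothetical node identifiers, for illustration only.
gold = [("s1_1", "s1_2"), ("s1_3", "s1_5"), ("s1_4", "s1_4")]
predicted = [("s1_1", "s1_2"), ("s1_3", "s1_6")]
precision, recall = precision_recall(gold, predicted)
```

With these inputs, one of the two predicted links is correct and one of the three gold links is found, so precision is 1/2 and recall is 1/3.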
