Alberto Simões 🐪

Documentation

  • nat-StarDict - Creates a StarDict from a NATools corpus.
  • nat-addDict:
  • nat-codify - Command line tool to codify corpora
  • nat-compareDicts - used to compare two PTDs in Perl dumper format.
  • nat-create - Command line tool to create NATools Corpora Objects
  • nat-css - Corpus Search Sentence utility.
  • nat-dict - interface for operations on binary PTDs.
  • nat-dumpDicts - Command line tool to dump NATools PTDs
  • nat-examplesExtractor:
  • nat-initmat - initialize a sparse matrix with word co-occurrences.
  • nat-ipfp - one of the three possible EM-Algorithm implementations of NATools
  • nat-lex2perl - dumps a lexicon file as a Perl hash.
  • nat-makeCWB - Dumps a NATools corpus in a format suitable to be imported into CWB
  • nat-mat2dic - A translator from co-occurrence matrices to a dictionary file.
  • nat-mkMakefile - generates a pmakefile to be used by Makefile::Parallel
  • nat-mkRealDict - used to create a dictionary similar to a PTD based on a word-aligned corpus.
  • nat-ngramsIdx - Indexes an n-grams SQLite file
  • nat-pair2tmx - joins two files in NATools input format into a TMX file.
  • nat-postbin - A translator from dictionary file to the Perl readable format.
  • nat-pre - A pre-processor for parallel texts, counting words, checking sentence numbers, and creating auxiliary files.
  • nat-rank - classifies each aligned sentence of a parallel corpus
  • nat-samplea - one of the three possible EM-Algorithm implementations of NATools
  • nat-sampleb - one of the three possible EM-Algorithm implementations of NATools
  • nat-sentalign - C sentence aligner.
  • nat-sentence-align - simple interface for Vanilla aligner.
  • nat-shell - A shell interface to NATools corpora alignment
  • nat-substDict:
  • nat-tmx2pair - splits a TMX file into several files, one for each language

Modules

Provides