The Open-CWB <http://cwb.sourceforge.net/> project makes available two modules to work with CWB version 3.0.x. Unfortunately they are not available on CPAN, and therefore it is not possible to depend on them. This ALTERNATE distribution is exactly th...
AMBS/Alt-CWB-ambs-2.2.102.6 - 16 May 2017 18:30:54 GMT
- CWB - Perl toolbox for the IMS Corpus Workbench
- CWB::CQP - Interact with a CQP process running in the background
- CWB::CEQL - The Common Elementary Query Language for CQP front-ends
The Open-CWB <http://cwb.sourceforge.net/> project makes available two modules to work with CWB version 3.0.x. Unfortunately they are not available on CPAN, and therefore it is not possible to depend on them. This ALTERNATE distribution is exactly th...
AMBS/Alt-CWB-CL-ambs-2.2.102.1 - 03 Oct 2015 12:50:11 GMT
- CWB::CL - Perl interface to the low-level Corpus Library of the IMS Open CWB
- CWB::CL::Strict - Load Perl/CL interface in strict mode
Implements a set of utility functions to access "Lingua::FreeLing3" objects. Every time one of the accessors is used with just the language code/language data file (or with the default language), the cached processor is returned if it exists. If any ...
AMBS/Lingua-FreeLing3-0.09 - 12 Jan 2014 16:21:27 GMT
- Lingua::FreeLing3::Word - Interface to FreeLing3 Word object
- Lingua::FreeLing3::Sentence - Interface to FreeLing3 Sentence object
- Lingua::FreeLing3::Splitter - Interface to FreeLing3 Splitter
"nat-ptd" supports the following commands. In most places where a PTD needs to be specified, you can use a bzip2-compressed PTD, as long as the filename ends in bz2....
AMBS/Lingua-PTD-1.16 - 20 Aug 2017 17:50:33 GMT
- Lingua::PTD - Module to handle PTD files in Dumper Format
- Lingua::PTD::TSV - Sub-module to export PTD to TSV
- Lingua::PTD::BzDmp - Sub-module to handle PTD bzipped files in Dumper Format
This program encodes a pair of languages extracted from a TMX file as a CWB corpus. Optionally it can tokenize the text (using basic tokenizing rules). Accepted options are: "-from" "-to" These two parameters are useful when more than two languages a...
AMBS/XML-TMX-CWB-0.10 - 29 Apr 2015 14:10:47 GMT
- cwb2tmx - convert an aligned CWB corpus to TMX format
- XML::TMX::CWB - TMX interface with Open Corpus Workbench
Gives statistical information about TMX files like the number of translation units....
AMBS/XML-TMX-0.36 - 07 Sep 2017 10:36:49 GMT
- tsv2tmx - Create a TMX from a TSV file
- tmx2tmx - Utility to convert and filter TMX files
- tmxuniq - removes duplicated translation units from TMXs
This is the basic command used to create a NATools Corpora Object from the command line. A NATools Corpora Object is a directory with: * the configuration file ("nat.cnf" - metadata information) * the corpus * the corpus indexes * the probabilistic t...
AMBS/Lingua-NATools-v0.7.10 - 31 Oct 2015 16:52:31 GMT
- nat-dict - interface for binary PTDs operations.
- nat-makeCWB - Dumps a NATools corpus in a format suitable to be imported in CWB
- nat-dumpDicts - Command line tool to dump NATools PTDs
Config::AutoConf is intended to provide the same opportunities to Perl developers as GNU Autoconf <http://www.gnu.org/software/autoconf/> does for Shell developers. As Perl is the second most deployed language (mind: every Unix comes with Perl, sever...
REHSACK/Config-AutoConf-0.317 - 08 Jun 2018 13:48:39 GMT
SMASH/PLN-PT-0.008 - 28 Oct 2017 17:10:15 GMT
- PLN::PT - interface for the http://pln.pt web service
Some basic tools useful when you have DB_Files with textual data and need to debug or access their contents, and you are not willing to create a script for that task....
AMBS/DB_File-Utils-0.006 - 06 Oct 2015 08:38:50 GMT
- DB_File::Utils - main module for db_util command line tool
App::Cmd is intended to make it easy to write complex command-line applications without having to think about most of the annoying things usually involved. For information on how to start using App::Cmd, see App::Cmd::Tutorial....
RJBS/App-Cmd-0.331 - 17 Jul 2016 19:57:27 GMT
STARTING WITH VERSION 0.25, Lingua::Identify IS UNICODE BY DEFAULT! "Lingua::Identify" identifies the language a given string or file is written in. See section WHY LINGUA::IDENTIFY for a list of "Lingua::Identify"'s strong points. See section KNOWN ...
AMBS/Lingua-Identify-0.56 - 17 Aug 2013 19:25:00 GMT
"sentences" is a command line tool for text segmentation and annotation. It uses the "fsentences" function from "Lingua::PT::PLNbase". Its main behaviour is the detection of sentences and paragraphs, and their annotation with XML-like tags: <s> for s...
AMBS/Lingua-PT-PLNbase-0.27 - 04 Oct 2014 15:33:42 GMT
- Lingua::PT::PLNbase - Perl extension for NLP of the Portuguese
The files that are usually installed for each dictionary are: affix file, hash file, irregular file (if exists) and meta (yaml) file. All these files are text documents, meaning they are architecture independent. Regarding the hash file, it is langua...
AMBS/Lingua-Jspell-1.93 - 06 Jun 2017 19:51:52 GMT
- jspell-dict - Command line tool to manage Jspell dictionaries
- Lingua::Jspell - Perl interface to the Jspell morphological analyser.
- jspell-installdic - automates the installation of a remote jspell dictionary
Dist::Zilla builds distributions of code to be uploaded to the CPAN. In this respect, it is like ExtUtils::MakeMaker, Module::Build, or Module::Install. Unlike those tools, however, it is not also a system for installing code that has been downloaded...
RJBS/Dist-Zilla-6.012 - 21 Apr 2018 08:22:01 GMT
p5stack is a tool that, given a small set of configuration directives, quickly (in a single command) sets up the required modules inside a local directory specific to the project, including a specific perl version if required. This allows to c...
SMASH/App-p5stack-0.004 - 22 Dec 2015 12:30:56 GMT
A simple frontend to the identify function of "Lingua::Identify::CLD"....
AMBS/Lingua-Identify-CLD-0.10 - 01 Nov 2016 16:52:03 GMT
- Lingua::Identify::CLD - Interface to Chrome language detection library.
AMBS/Arango-Tango-0.010 - 01 May 2019 19:59:23 GMT
- Arango::Tango::Collection - ArangoDB Collection object
- Arango::Tango::Database - ArangoDB Database object
- Arango::Tango::Cursor - ArangoDB Cursor object
Inline::Files generalizes the notion of the "__DATA__" marker and the associated "<DATA>" filehandle, to an arbitrary number of markers and associated filehandles. When you add the line: use Inline::Files; to a source file you can then specify an arb...
AMBS/Inline-Files-0.71 - 31 Mar 2019 13:05:55 GMT
n-Gram analysis is a field in textual analysis which uses sliding window character sequences in order to aid topic analysis, language determination and so on. The n-gram spectrum of a document can be used to compare and filter documents in multiple l...
AMBS/Text-Ngram-0.15 - 17 Jul 2014 15:52:11 GMT