Lingua::EN::Sentence - split text into sentences River stage two • 8 direct dependents • 10 total dependents

The "Lingua::EN::Sentence" module contains the function get_sentences, which splits text into its constituent sentences, based on a regular expression and a list of abbreviations (built in and given). Certain well-known exceptions, such as abbreviatio...

KIMRYAN/Lingua-EN-Sentence-0.31 - 19 Aug 2018 08:28:30 UTC

Lingua::EN::Semtags::Sentence - a DTO used by Lingua::EN::Semtags::Engine River stage zero No dependents

A DTO used by "Lingua::EN::Semtags::Engine". Aggregates instances of "Lingua::EN::Semtags::LangUnit"s. METHODS: add_lunit($lunit) adds $lunit to "$self->{lunits}"; lunits() returns "$self->{lunits}"; phrase_tokens() returns "$self->{phrase_tokens}". R...

IGORM/Lingua-EN-Semtags-Engine-0.01 - 25 Apr 2008 17:48:16 UTC

Lingua::EN::Sentence::Offsets - Finds sentence boundaries, and returns their offsets. River stage zero No dependents

ANDREFS/Lingua-EN-Sentence-Offsets-0.03 - 03 Mar 2014 11:40:09 UTC

FL3 - A shortcut module for Lingua::FreeLing3. River stage one • 1 direct dependent • 1 total dependent

Implements a set of utility functions to access "Lingua::FreeLing3" objects. Every time one of the accessors is used with just the language code/language data file (or using the default language), the cached processor is returned if it exists. If any ...

AMBS/Lingua-FreeLing3-0.09 - 12 Jan 2014 16:21:27 UTC

treealign - training tree alignment classifiers and aligning syntactic trees River stage zero No dependents

This script allows you to train a tree alignment model and to apply it to parallel treebanks. Tree alignment is based on local binary classification and rich feature sets. Currently, training data has to be in Stockholm Tree Aligner format. The out...

TIEDEMANN/Lingua-Align-0.04 - 10 Dec 2012 18:31:24 UTC

umls-allwords-senserelate.pl - This program performs all-words word sense disambiguation and assigns senses from the UMLS to each ambiguous term in a running text using semantic similarity measures. River stage zero No dependents

BTMCINNES/UMLS-SenseRelate-0.29 - 24 Jul 2013 09:28:12 UTC

Text::Info - Retrieve information about, and do analysis on, text. River stage zero No dependents

Text::Info is an extensible and easy to use solution for retrieving useful information about texts based on the Germanic languages <https://en.wikipedia.org/wiki/Germanic_languages>. For the time being it has a limited feature set, but the plan is to...

TOREAU/Text-Info-0.01 - 31 Aug 2015 06:00:33 UTC

Date::Baha::i - Convert to and from Baha'i dates River stage zero No dependents

This package renders the Baha'i date from two standard date formats - epoch time and a (year, month, day) triple. It also converts a Baha'i date to standard ymd format. CYCLES: Each cycle of nineteen years is called a Vahid. Nineteen cycles constitute...

GENE/Date-Baha-i-0.2100 - 02 Oct 2018 22:17:50 UTC

Lingua::YaTeA - Perl extension for extracting terms from a corpus and providing a syntactic analysis in a head-modifier format. River stage one • 1 direct dependent • 1 total dependent

This module is the main module of the software named YaTeA. It aims at extracting noun phrases that look like terms from a corpus. It provides their syntactic analysis in a head-modifier representation. As an input, the term extractor requires a corp...

THHAMON/Lingua-YaTeA-0.626 - 26 Oct 2018 12:48:02 UTC

Locale::Maketext::TPJ13 - article about software localization River stage three • 28 direct dependents • 119 total dependents

The following article by Sean M. Burke and Jordan Lachler first appeared in *The Perl Journal* #13 and is copyright 1999 The Perl Journal. It appears courtesy of Jon Orwant and The Perl Journal. This document may be distributed under the same terms a...

TODDR/Locale-Maketext-1.29 - 20 Jan 2020 05:04:23 UTC

Text::Capitalize - capitalize strings ("to WORK AS titles" becomes "To Work as Titles") River stage one • 2 direct dependents • 3 total dependents

Text::Capitalize provides some routines for title-like formatting of strings. The simple capitalize function just makes the initial character of each word uppercase, and forces the rest to lowercase. The capitalize_title function applies English title...

DOOM/Text-Capitalize-1.5 - 27 Sep 2019 02:25:45 UTC

Lingua::NATools - A framework for Parallel Corpora processing River stage zero No dependents

This is a collection of functions used by the NATools tools. Some of them can be used independently. Check the documentation below. "init" Use this function to initialize a parallel corpora repository. You must supply a "directory" where the repository ...

AMBS/Lingua-NATools-v0.7.12 - 09 Nov 2020 21:40:02 UTC

Text::GaleChurch - Perl extension for aligning translated sentences River stage zero No dependents

This module aligns the sentences of paragraphs in two languages in a way that the aligned sentences are likely translations of each other. This is useful for applications in machine translation and other applications where sentence-aligned parallel c...

ACHIMRU/Text-GaleChurch-1.00 - 13 Mar 2010 15:57:12 UTC

Lingua::EN::Inflexion - Inflect English nouns, verbs, adjectives, and articles River stage one • 7 direct dependents • 7 total dependents

Lingua::EN::Inflexion allows you to correctly inflect all English nouns and verbs, as well as the small number of adjectives and articles that still decline in modern English. By default, the module follows the conventions of modern formal British En...

DCONWAY/Lingua-EN-Inflexion-0.002000 - 26 Jul 2020 22:07:31 UTC

Text::StemTagPOS - Computes stemmed/POS tagged lists of text. River stage one • 2 direct dependents • 2 total dependents

"Text::StemTagPOS" uses the modules Lingua::Stem::Snowball and Lingua::EN::Tagger to do part-of-speech tagging and stemming of English text. It was developed to pre-process text for other modules. Encoding of all text should be in Perl's internal for...

KUBINA/Text-StemTagPOS-0.61 - 31 Dec 2011 13:41:21 UTC

Lingua::EN::Alphabet::Shaw - transliterate from the Latin to the Shavian alphabet River stage one • 1 direct dependent • 1 total dependent

The Shaw or Shavian alphabet was commissioned by the will of the playwright George Bernard Shaw in the early 1960s as a replacement for the Latin alphabet for representing English. It is designed to have a one-to-one phonemic (not phonetic) mapping w...

MARNANEL/Lingua-EN-Alphabet-Shaw-0.64 - 16 Sep 2010 12:24:57 UTC

Text::Summarize::En - Routine to summarize English text. River stage zero No dependents

"Text::Summarize" contains routines for ranking the sentences in English text for inclusion in a summary using the sumBasic algorithm....

KUBINA/Text-Summarize-0.50 - 12 Mar 2012 17:24:35 UTC

Text::Corpus::CNN::Document - Parse CNN article for research. River stage zero No dependents

"Text::Corpus::CNN::Document" provides methods for accessing specific portions of CNN news articles for personnel researching and testing of information processing methods. Read the CNN Interactive Service Agreement to ensure you abide with their Ser...

KUBINA/Text-Corpus-CNN-1.02 - 21 Aug 2010 16:33:02 UTC

Lingua::EN::Fathom - Measure readability of English text River stage one • 1 direct dependent • 1 total dependent

This module analyses English text in either a string or file. Totals are then calculated for the number of characters, words, sentences, blank and non-blank (text) lines and paragraphs. Three common readability statistics are also derived: the Fog, F...

KIMRYAN/Lingua-EN-Fathom-1.22 - 31 Oct 2018 21:39:45 UTC

Lingua::Sentence - Perl extension for breaking text paragraphs into sentences River stage one • 4 direct dependents • 4 total dependents

This module allows splitting of text paragraphs into sentences. It is based on scripts developed by Philipp Koehn and Josh Schroeder for processing the Europarl corpus (<http://www.statmt.org/europarl/>). The module uses punctuation and capitalizatio...

CAPOEIRAB/Lingua-Sentence-1.100 - 26 Feb 2017 23:06:04 UTC