Lingua::ZH::TaBE - Chinese processing via libtabe

This module is a Perl interface to the TaBE (Taiwan and Big5 Encoding) library, a unified interface and library dealing with Chinese words, phrases, sentences, and phonetic symbols; it is intended to be used as the foundation of Chinese text process...

AUTRIJUS/Lingua-ZH-TaBE-0.07 - 31 Dec 2005 07:37:55 GMT

Lingua::ZH::Wrap - Wrap Chinese text

"Lingua::ZH::Wrap::wrap()" is a very simple paragraph formatter. It formats a single paragraph at a time by breaking lines at Chinese character boundaries. Indentation is controlled for the first line ($initial_tab) and all subsequent lines ($subseque...

AUTRIJUS/Lingua-ZH-Wrap-0.03 - 25 Jul 2004 16:34:53 GMT
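
To make the idea of breaking lines at Chinese character boundaries concrete, here is a minimal Python sketch. It is not the module's implementation or its Perl API: the function `wrap_cjk`, the parameters `first_indent`, `rest_indent`, and `columns`, and the rule that wide characters occupy two columns are all assumptions made for illustration.

```python
import unicodedata


def char_width(ch):
    """Assume East Asian wide/fullwidth characters take two columns."""
    return 2 if unicodedata.east_asian_width(ch) in ("W", "F") else 1


def text_width(text):
    return sum(char_width(ch) for ch in text)


def wrap_cjk(first_indent, rest_indent, text, columns=20):
    """Break `text` between characters so no line exceeds `columns`.

    Hypothetical helper: `first_indent` prefixes the first line and
    `rest_indent` prefixes every following line.
    """
    lines = []
    current = first_indent
    width = text_width(first_indent)
    for ch in text:
        w = char_width(ch)
        # Start a new line only if the current one already holds text,
        # so an over-long indent never produces empty lines.
        if width + w > columns and current.strip():
            lines.append(current)
            current = rest_indent
            width = text_width(rest_indent)
        current += ch
        width += w
    if current.strip():
        lines.append(current)
    return "\n".join(lines)


if __name__ == "__main__":
    print(wrap_cjk("  ", "", "中文斷行測試", columns=8))
```

Because every Chinese character is a legal break point, the sketch needs no word-boundary detection; it only tracks display width.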

Lingua::ZH::Toke - Chinese Tokenizer

This module puts a thin wrapper around Lingua::ZH::TaBE, by blessing references to TaBE's objects into its English counterparts. Besides offering more readable class names, this module also offers various overloaded methods for tokenization; please se...

AUTRIJUS/Lingua-ZH-Toke-0.02 - 11 Jan 2004 13:13:35 GMT

Lingua::ZH::Keywords - Extract keywords from Chinese text

This is a very simple algorithm which removes stopwords from the text, and then counts up what it considers to be the most important keywords. The "keywords" subroutine returns a list of keywords in order of relevance. The stopwords list is accessibl...

AUTRIJUS/Lingua-ZH-Keywords-0.04 - 20 Jan 2003 22:42:35 GMT
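
The stopword-and-count approach described above can be sketched in a few lines of Python. This is an illustration of the general technique, not the module's code: the `STOPWORDS` set, the `keywords` signature with its `limit` parameter, and the assumption that the text has already been split into tokens are all hypothetical.

```python
from collections import Counter

# Hypothetical stopword list; the module ships its own.
STOPWORDS = {"的", "是", "了", "在"}


def keywords(tokens, limit=5):
    """Return up to `limit` tokens, most frequent first, skipping stopwords.

    Assumes `tokens` is an already-segmented list of words.
    """
    counts = Counter(t for t in tokens if t not in STOPWORDS)
    return [word for word, _ in counts.most_common(limit)]


if __name__ == "__main__":
    sample = ["中文", "的", "斷詞", "是", "中文", "的", "關鍵字", "中文", "斷詞"]
    print(keywords(sample, limit=2))
```

Here relevance is simply raw frequency; any weighting beyond that would be a further assumption.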

OurNet::FuzzyIndex - Inverted search for double-byte characters

OurNet::FuzzyIndex implements a simple consecutive-letter indexing mechanism specifically designed for multi-byte encoding maps, e.g. big-5 or utf8. It uses DB_File to create an associative mapping from each character to its consecutive one, utilizin...

AUTRIJUS/OurNet-FuzzyIndex-1.60 - 24 Jan 2003 00:13:23 GMT
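
One way to picture consecutive-character indexing is a mapping from each adjacent character pair to the documents that contain it. The Python sketch below is an assumption-laden illustration, not the module's design: it keeps the index in an in-memory dict instead of DB_File, and `build_index`, `search`, and the pair-count scoring are hypothetical names and choices.

```python
from collections import defaultdict


def build_index(docs):
    """Map each pair of consecutive characters to the ids of documents containing it."""
    index = defaultdict(set)
    for doc_id, text in docs.items():
        for a, b in zip(text, text[1:]):
            index[a + b].add(doc_id)
    return index


def search(index, query):
    """Rank documents by how many of the query's character pairs they contain."""
    scores = defaultdict(int)
    for a, b in zip(query, query[1:]):
        for doc_id in index.get(a + b, ()):
            scores[doc_id] += 1
    # Highest score first; ties broken by document id for stable output.
    return sorted(scores, key=lambda d: (-scores[d], d))


if __name__ == "__main__":
    docs = {"a": "中文全文檢索", "b": "英文檢索"}
    index = build_index(docs)
    print(search(index, "全文檢索"))
```

Indexing character pairs avoids any need for word segmentation, which is why the approach suits text without spaces between words.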
