NAME
Lingua::Stem::UniNE::DE - German stemmer
VERSION
This document describes Lingua::Stem::UniNE::DE v0.08.
SYNOPSIS
use Lingua::Stem::UniNE::DE qw( stem_de );
my $stem = stem_de($word);
# alternate syntax
$stem = Lingua::Stem::UniNE::DE::stem($word);
DESCRIPTION
Light and aggressive stemmers for the German language. The light stemmer removes plural endings and umlauts. The aggressive stemmer also removes inflectional suffixes and additional diacritics.
This module provides the stem and stem_de functions for the light stemmer, which are synonymous and can optionally be exported, plus the stem_aggressive and stem_de_aggressive functions for the aggressive stemmer. Each function accepts a single word and returns a single stem.
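As a rough illustration of how the two stemmers might be compared side by side, the sketch below runs both over the singular and plural nouns quoted in the NOTES section. It makes some assumptions: the words are lowercased before stemming as a precaution, since this document does not say whether the functions casefold their input, and the aggressive function is called by its fully qualified name so the example does not depend on it being exportable. No output is shown because the resulting stems depend on the installed version.

```perl
use strict;
use warnings;
use utf8;                               # the word list below contains ä and ö
use open qw( :std :encoding(UTF-8) );   # print non-ASCII stems correctly

use Lingua::Stem::UniNE::DE qw( stem_de );

# Singular/plural pairs taken from the Savoy quotation in NOTES.
my @words = qw( Frau Frauen Bild Bilder Sohn Söhne Apfel Äpfel );

for my $word (@words) {
    # Assumption: lowercase first, in case the stemmer expects casefolded input.
    my $input = lc $word;

    my $light      = stem_de($input);
    my $aggressive = Lingua::Stem::UniNE::DE::stem_de_aggressive($input);

    printf "%-8s light=%-8s aggressive=%s\n", $word, $light, $aggressive;
}
```

If the light stemmer behaves as described, each singular and plural pair should tend to collapse to the same stem, which is the property that matters for indexing and retrieval.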
NOTES
“In proposing stemmers for other languages than English, we think that a ‘light’ stemmer (removing inflections only for noun and adjectives) presents some advantages. […] In German, a few rules may be applied to obtain the plural form of words (e.g., ‘Frau’ into ‘Frauen’ (woman), ‘Bild’ into ‘Bilder’ (picture), ‘Sohn’ into ‘Söhne’ (son), ‘Apfel’ into ‘Äpfel’ (apple)), but the suggested algorithms do not account for person and tense variations, or for the morphological variations used by verbs (we think that indexing verbs for Italian, French or German is not of primary importance compared to nouns and adjectives).” —Jacques Savoy, IR Multilingual Resources at UniNE
“For the German corpus, Porter’s stemmer provided better retrieval performance than did the UniNE scheme (average difference of 3.7% over nine IR models). The difference between these two stemming schemes however was never statistically significant.” —Jacques Savoy, Light Stemming Approaches for the French, Portuguese, German and Hungarian Languages
SEE ALSO
Lingua::Stem::UniNE provides a stemming object with access to all of the implemented University of Neuchâtel stemmers including this one. It has additional features like stemming lists of words.
Lingua::Stem::Any provides a unified interface to any stemmer on CPAN, including this one, as well as additional features like normalization, casefolding, and in-place stemming.
This module is based on stemming algorithms by Jacques Savoy of the University of Neuchâtel and implemented in C (light, aggressive).
AUTHOR
Nick Patch <patch@cpan.org>
This module is brought to you by Shutterstock. Additional open source projects from Shutterstock can be found at code.shutterstock.com.
COPYRIGHT AND LICENSE
© 2014 Shutterstock, Inc.
This library is free software; you can redistribute it and/or modify it under the same terms as Perl itself.