NAME

Lingua::RU::Detect - Heuristics for guessing encoding sequence

SYNOPSIS

        use feature "say";
        use Data::Dumper;
        use Lingua::RU::Detect "detect_enc";

        say Dumper(detect_enc("бНОПНЯ"));
        say Dumper(detect_enc("бОДТЕК"));

ABSTRACT

Lingua::RU::Detect guesses the sequence of encodings through which the original text was repeatedly reconverted.

DESCRIPTION

This module is the heart of the http://decodr.ru/ website, which provides a tool for automatically recovering Russian texts damaged by multiple transcodings. Chains of two and three encodings can currently be detected, and detection is much faster than in dictionary-based programmes.

Calling the detect_enc subroutine returns a list of encoding pairs. To recover the original UTF-8 string, apply all of these transcodings in the order in which they appear in the returned array. For example:

        $VAR1 = [
                [
                        'UTF-8',
                        'ISO-8859-5'
                ],
                [
                        'KOI8-R',
                        'UTF-8'
                ]
        ];

If no re-encoding is needed, the result is an empty array.
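
Applying the returned chain can be sketched with the core Encode module. This is only an illustration under stated assumptions: each pair is read as [from, to], the text is handled as a byte string, and apply_chain is a hypothetical helper that is not part of Lingua::RU::Detect. The chain is hard-coded from the example above rather than obtained from detect_enc, and the damaged input is simulated by misreading KOI8-R bytes as ISO-8859-5.

```perl
use strict;
use warnings;
use utf8;
use Encode qw(encode decode from_to);

# Hypothetical helper (not part of Lingua::RU::Detect): applies each
# [from, to] pair, in order, to a byte string and returns the result.
sub apply_chain {
    my ($bytes, $chain) = @_;
    for my $pair (@$chain) {
        my ($from, $to) = @$pair;
        # from_to() converts its first argument in place and
        # returns undef if the conversion fails.
        defined from_to($bytes, $from, $to)
            or die "Cannot convert from $from to $to";
    }
    return $bytes;
}

# Simulated damage: KOI8-R bytes misread as ISO-8859-5, then stored as UTF-8.
my $original = 'Привет';
my $damaged  = encode('UTF-8', decode('ISO-8859-5', encode('KOI8-R', $original)));

# Chain copied from the example above; assumed to mean [from, to] pairs.
my $chain = [ [ 'UTF-8', 'ISO-8859-5' ], [ 'KOI8-R', 'UTF-8' ] ];

# Decode the repaired UTF-8 bytes back into characters for comparison.
my $restored = decode('UTF-8', apply_chain($damaged, $chain));
print $restored eq $original ? "restored\n" : "mismatch\n";
```

An empty chain leaves the input untouched, which matches the case where no re-encoding is needed.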

For a test suite, refer to the Wikipedia page http://ru.wikipedia.org/wiki/%D0%9A%D1%80%D0%BE%D0%BA%D0%BE%D0%B7%D1%8F%D0%B1%D1%80%D1%8B (not all of its examples pass with the current version).

AUTHOR

Andrew Shitov, <andy@shitov.ru>

COPYRIGHT AND LICENSE

The Lingua::RU::Detect module is free software. You may redistribute and/or modify it under the same terms as Perl 5.10.

