The Perl Toolchain Summit needs more sponsors. If your company depends on Perl, please support this very important event.

NAME

CharsetDetector - Detect charset

SYNOPSIS

  use CharsetDetector;
        
        my $binary = "...";
        my $charset_str = "charset = ...";
        
        my $charset = CharsetDetector::detect($binary);
        $charset = CharsetDetector::detect1($binary, 5);
        # Test the charset of $binary
        # '' for undef
        # 'iso-8859-1' for ''
        
        $charset = CharsetDetector::detect($binary, 5);
        $charset = CharsetDetector::detect1($binary, 5);
        # Test the charset of substr($binary, 0, 5)
        
        $charset = CharsetDetector::detect($charset_str);
        # Test the charset in $charset_str
        
        $charset = CharsetDetector::detect_debug($binary);
        print $CharsetDetector::log_txt;
        # you can see the log of testing $binary
        

The synopsis above only lists the major methods and parameters.

Basic Function

detect - detect charset

                $charset = CharsetDetector::detect($binary [, $max_len]);
                $charset = CharsetDetector::detect($charset_str [, $max_len]);
                # $charset_str is like "charset=..."
                # if input is '', output is 'iso-8859-1'
                # if input is undef, output is ''
                

detect1 - detect charset only in encoding

                $charset = CharsetDetector::detect1($binary [, $max_len]);
                # if input is '', output is 'iso-8859-1'
                # if input is undef, output is ''

detect_debug - detect charset and then you can see the log

                $charset = CharsetDetector::detect_debug($binary [, $max_len]);
                print $CharsetDetector::log_txt;
                # if input is '', output is 'iso-8859-1'
                # if input is undef, output is ''

COPYRIGHT

The CharsetDetector module is Copyright (c) 2003-2006 QIAN YU. All rights reserved.

You may distribute under the terms of either the GNU General Public License or the Artistic License, as specified in the Perl README file.