

Cz::Cstocs - conversions of charset encodings for the Czech language


        use Cz::Cstocs;
        my $il2_to_ascii = new Cz::Cstocs 'il2', 'ascii';
        while (<>) {
                print &$il2_to_ascii($_);
        }

        use Cz::Cstocs 'il2_ascii';
        while (<>) {
                print il2_ascii($_);
        }

        use Cz::Cstocs;
        sub il2toascii;
                # inform the parser that there is a function il2toascii
        *il2toascii = new Cz::Cstocs 'il2', 'ascii';
                # now define the function
        print il2toascii $data;
                # thanks to Jan Krynicky for pointing this out


This module converts text between the various charset encodings used for the Czech and Slovak languages. A Cz::Cstocs object is created with the method new, which takes at least two parameters: the input encoding and the output encoding. The resulting object can then be used as a function reference to convert strings or lists of strings. Cz::Cstocs accepts fairly free-form encoding aliases, so iso8859-2, ISO-8859-2, iso88592 and il2 all name the same encoding. For backward compatibility, the method conv is supported as well, so the example above could also read:

        while (<>) {
                print $il2_to_ascii->conv($_);
        }

You can also use typeglob syntax.

The conversion function takes a list and returns a list of converted strings (in list context) or a single string consisting of the concatenated results (in scalar context).
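The difference between the two calling contexts can be sketched as follows. This is a minimal illustration, not part of the module's own documentation: the sample words and the variable names are made up for the example, and the bytes are written as escapes so that the script does not depend on the encoding of its own source file.

```perl
use strict;
use warnings;
use Cz::Cstocs;

my $il2_to_ascii = Cz::Cstocs->new('il2', 'ascii')
    or die "Cannot create converter: $Cz::Cstocs::errstr\n";

# Sample ISO-8859-2 input: "\xe8" is c with caron, "\xf8" is r with caron.
my @words = ("\xe8aj", "ve\xe8e\xf8e");

# List context: one converted string per input string.
my @converted = $il2_to_ascii->(@words);

# Scalar context: the converted strings concatenated into one.
my $joined = $il2_to_ascii->(@words);

print "$_\n" for @converted;
print "$joined\n";
```

Calling the object through `->( ... )` is equivalent to the `&$il2_to_ascii( ... )` form used in the synopsis.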

You can modify the behaviour of the conversion function by specifying a hash of options after the encoding names in the call to new.


Gives an alternate string that will replace characters from the input encoding that are not present in the output encoding. The default is a space.


Defines whether the accent file should be used. Default is 1 (true).


When 1 (true), characters that have no counterpart in either the accent file or the output encoding are kept instead of being replaced with fillstring. Default is 0 except for tex, because you would probably rather keep backslashed symbols than lose them.


Alternate location for encoding and accent files. The default is the Cz/Cstocs/enc directory in the Perl library tree. This location can also be changed with the CSTOCSDIR environment variable.
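As a rough sketch of passing options, the call below sets the fill string to a question mark. It assumes that options are given as key => value pairs directly after the two encoding names and that the option is spelled fillstring, as the text above suggests; the sample input is made up for the example.

```perl
use strict;
use warnings;
use Cz::Cstocs;

# Assumption: options follow the encoding names as key => value pairs.
my $il2_to_ascii = Cz::Cstocs->new('il2', 'ascii', fillstring => '?')
    or die "Cannot create converter: $Cz::Cstocs::errstr\n";

# "\xe8" is c with caron and "\xa4" is the currency sign in ISO-8859-2.
# Characters that cannot be expressed in the output encoding are
# replaced with the fill string instead of the default space.
my $converted = $il2_to_ascii->("\xe8aj \xa4");

print "$converted\n";
```

Functions imported through the use Cz::Cstocs 'il2_ascii' style cannot take options, so a converter that needs them has to be created through new.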

There is an alternate way to define the conversion function: any arguments after use Cz::Cstocs that have the form encoding_encoding or encoding_to_encoding are processed and the appropriate functions are imported. So,

        use Cz::Cstocs qw(pc2_to_il2 il2_ascii);

defines two functions that are loaded into the caller's namespace and can be used directly. In this case you cannot specify additional options; you get only the default behaviour.


If you request an unknown encoding in the call to new Cz::Cstocs, the conversion object is not defined and the variable $Cz::Cstocs::errstr is set to the error message. When you specify an unknown encoding in the use call style (like use Cz::Cstocs 'il2_ascii';), die is called.
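A minimal sketch of checking for this error when calling new is shown below. The encoding name no-such-encoding is a made-up placeholder that is assumed not to match any installed encoding.

```perl
use strict;
use warnings;
use Cz::Cstocs;

# 'no-such-encoding' is a hypothetical name used only to trigger the error.
my $converter = Cz::Cstocs->new('il2', 'no-such-encoding');

if (defined $converter) {
    print $converter->("text"), "\n";
} else {
    # new returned undef; the reason is in $Cz::Cstocs::errstr.
    warn "Cannot create converter: $Cz::Cstocs::errstr\n";
}
```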


Jan Pazdziora created the module version.

Jan "Yenya" Kasprzak wrote the original Un*x implementation.




cstocs(1), perl(1), or Xcstocs at