HEBCI - HTML Entity Based Codepage Inference
    use Encode::HEBCI;

    $hebci = Encode::HEBCI->new();
    @fingerprint_entities = $hebci->supported_entities();
    @possible_encodings = $hebci->fingerprint(%entities_to_values);
The Encode::HEBCI module provides a mechanism to determine the character encoding used to submit an HTML form. It does this by examining the encoded values of specially chosen HTML entities to infer which encodings could have been used, and returns the candidates as a list.
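The inference idea can be illustrated with nothing but the core Encode module. The following is a minimal sketch under stated assumptions, not HEBCI's actual implementation: the two entities, the four encodings, and the names %entity_codepoint, %table and candidate_encodings are all chosen here for illustration. It encodes each entity's character in every encoding, then keeps the encodings whose bytes match what was observed.

```perl
use strict;
use warnings;
use Encode qw(encode);

# Hypothetical mini-table: entity name => Unicode code point.
my %entity_codepoint = (
    eacute => 0xE9,      # e with acute accent
    euro   => 0x20AC,    # euro sign
);

# Encodings to tell apart in this sketch (not HEBCI's real list).
my @encodings = ('UTF-8', 'iso-8859-1', 'cp1252', 'iso-8859-15');

# Build encoding => { entity => bytes }. Characters that an encoding
# cannot represent make encode() die, so they are simply left out.
my %table;
for my $enc (@encodings) {
    for my $name (keys %entity_codepoint) {
        my $char  = chr $entity_codepoint{$name};
        my $bytes = eval { encode($enc, $char, Encode::FB_CROAK()) };
        $table{$enc}{$name} = $bytes if defined $bytes;
    }
}

# Return the encodings whose bytes match every observed entity value.
sub candidate_encodings {
    my (%observed) = @_;
    my @hits = grep {
        my $enc = $_;
        !grep {
            !defined $table{$enc}{$_} || $table{$enc}{$_} ne $observed{$_}
        } keys %observed;
    } keys %table;
    @hits = sort @hits;
    return @hits;
}

# A lone 0xE9 for eacute is ambiguous; adding euro narrows it down.
my @loose = candidate_encodings(eacute => "\xE9");
my @tight = candidate_encodings(eacute => "\xE9", euro => "\x80");
print "eacute only: @loose\n";
print "eacute+euro: @tight\n";
```

Each additional entity can only shrink the candidate list, which is why some entities are more useful for fingerprinting than others.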
Full details are available at the HEBCI homepage, http://www.joshisanerd.com/set/.
To use the module, simply
Returns a new HEBCI object.
Returns an array containing the HTML entities that will give the best fingerprint, in order of decreasing utility.
Returns an array of possible encodings given the values in %entity_values.
%entity_values should be a hash that maps HTML entity names (that is, without the ampersand or semicolon) to the raw bytes returned to your application by the web browser.
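Getting at the raw bytes means percent-decoding the submission without applying any character decoding. The sketch below shows one way to build such a hash from a URL-encoded form body. It assumes the form carries one field per entity, named after the entity; that naming, the helper raw_form_values, and the sample body are hypothetical and are not part of the module.

```perl
use strict;
use warnings;

# Turn an application/x-www-form-urlencoded body into name => raw bytes.
# Only percent-escapes and '+' are undone; the bytes are left exactly
# as the browser sent them, with no character-set decoding.
sub raw_form_values {
    my ($body) = @_;
    my %values;
    for my $pair (split /[&;]/, $body) {
        my ($name, $value) = split /=/, $pair, 2;
        next unless defined $name && length $name;
        $value = '' unless defined $value;
        for ($name, $value) {
            tr/+/ /;
            s/%([0-9A-Fa-f]{2})/chr hex $1/ge;
        }
        $values{$name} = $value;
    }
    return %values;
}

# Hypothetical submission from a form whose fields held &eacute; and &euro;
my %entity_values = raw_form_values('eacute=%C3%A9&euro=%E2%82%AC');

# Show each value as hex so the byte sequence is visible.
for my $name (sort keys %entity_values) {
    printf "%s => %s\n", $name, unpack 'H*', $entity_values{$name};
}
```

A hash built this way has the shape described above and could then be handed to fingerprint as in the synopsis: $hebci->fingerprint(%entity_values).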
Returns an array containing the encodings this copy of HEBCI can distinguish between.
Returns the fingerprint table. You probably don't want this.
An example CGI is distributed with the source code.