
XML::Char - validate characters for XML


    use XML::Char;
    # control characters such as BEL (0x07) are not allowed in XML
    if (not XML::Char->valid("bell ".chr(7))) {
        die 'no way to store this string directly to XML';
    }

    use utf8;
    use XML::Char;
    if (XML::Char->valid("UTF8 je pořádný peklo")) {
        print "fuf, we are fine\n";
    }


For me it was kind of a surprise to learn that chr(0) is a valid UTF-8 character. All of the code points 0-0x7F are...

    Emo: well it's not because that they are valid utf-8 characters that you have to expect XML to accept them

Well, of course not, now I know :-) The XML specification defines which characters XML processors MUST accept:

    [2]         Char       ::=          #x9 | #xA | #xD | [#x20-#xD7FF] | [#xE000-#xFFFD] | [#x10000-#x10FFFF]
    /* any Unicode character, excluding the surrogate blocks, FFFE, and FFFF. */
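The production above translates almost directly into a Perl character class. The following is only a minimal pure-Perl sketch of that idea, not the module's XS implementation; the names `$XML_CHAR` and `is_xml_chars` are hypothetical, and the sketch looks at code points only, without the additional UTF-8 check the module performs.

```perl
use strict;
use warnings;

# Character class mirroring the XML 1.0 Char production [2].
my $XML_CHAR = qr/[\x{9}\x{A}\x{D}\x{20}-\x{D7FF}\x{E000}-\x{FFFD}\x{10000}-\x{10FFFF}]/;

# Hypothetical helper (not part of XML::Char): returns 1 when every
# character of $text matches the Char production, 0 otherwise.
sub is_xml_chars {
    my ($text) = @_;
    return $text =~ /\A$XML_CHAR*\z/ ? 1 : 0;
}

print is_xml_chars("tab\tand newline\n") ? "ok" : "rejected", "\n";   # ok
print is_xml_chars("bell " . chr(7))     ? "ok" : "rejected", "\n";   # rejected
print is_xml_chars("nul " . chr(0))      ? "ok" : "rejected", "\n";   # rejected
```

Tab, line feed and carriage return are the only characters below 0x20 that pass, which is why chr(0) and chr(7) are rejected even though both are valid UTF-8.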

This module checks whether a given string meets these criteria. In addition, the string has to be a valid Perl UTF-8 string (is_utf8_string() - see "Unicode-Support" in perlapi).


XML::Char->valid($value) returns true if $value consists only of valid UTF-8 XML characters, and false otherwise.
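When the check fails, a caller usually either rejects the string or removes the offending characters before writing XML. Below is a minimal pure-Perl sketch of the second option. The helper name `strip_xml_invalid` is hypothetical and not part of XML::Char, and the character class is an assumption derived from the XML 1.0 Char production quoted above.

```perl
use strict;
use warnings;

# Hypothetical helper (not part of XML::Char): delete every character
# that falls outside the XML 1.0 Char production.
sub strip_xml_invalid {
    my ($text) = @_;
    $text =~ s/[^\x{9}\x{A}\x{D}\x{20}-\x{D7FF}\x{E000}-\x{FFFD}\x{10000}-\x{10FFFF}]//g;
    return $text;
}

print strip_xml_invalid("bell " . chr(7) . "rings"), "\n";   # prints "bell rings"
```

Silently dropping characters changes the data, so rejecting the input, as the die in the first example does, may be the safer choice when the content matters.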


How can I strip invalid XML characters from strings in Perl?

Extensible Markup Language (XML) 1.0

Extensible Markup Language (XML) 1.1


Jozef Kutej

Aristotle Pagaltzis - completely rewrote the initial Char.XS to handle the SvUTF8 flag


Copyright 2009 Jozef Kutej, all rights reserved.

This program is free software; you can redistribute it and/or modify it under the same terms as Perl itself.