MKDoc::XML - The MKDoc XML Toolkit


This is an article, not a module.


MKDoc is a web content management system written in Perl which focuses on standards compliance, accessibility and usability issues, and multi-lingual websites.

At MKDoc Ltd we have decided to gradually break up our existing commercial software into a collection of completely independent, well-documented, well-tested open-source CPAN modules.

Ultimately we want MKDoc code to be a coherent collection of module distributions, yet each distribution should be usable and useful in itself.

MKDoc::XML is part of this effort.

You could help us by turning some of MKDoc's code into a CPAN module. You can take a look at the existing code at

If you are interested in some functionality which you would like to see as a standalone CPAN module, send an email to <>.


MKDoc::XML is a low-level XML library.
MKDoc::XML::* modules do not check that your XML is well-formed.
MKDoc::XML::* modules can be used to work with somewhat broken XML.
MKDoc::XML::* modules should not be used as high-level parsers for general-purpose XML unless you know what you're doing.


XML tokenizer

MKDoc::XML::Tokenizer splits your XML / XHTML files into a list of MKDoc::XML::Token objects using a single regex.
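
To give a rough idea of what single-regex tokenizing looks like, here is a minimal sketch in core Perl. It does not use MKDoc::XML::Tokenizer and is not its implementation: the function name tokenize_sketch and the pattern are assumptions made for illustration only. Like the real modules, it makes no attempt to check well-formedness.

```perl
use strict;
use warnings;

# Hypothetical helper, NOT part of MKDoc::XML: split markup into tag and
# text tokens with one regex. A tag is "<" up to the next ">"; a text token
# is any run of characters that contains no "<".
sub tokenize_sketch {
    my ($xml) = @_;
    my @tokens = $xml =~ /(<[^>]*>|[^<]+)/g;
    return \@tokens;
}

my $tokens = tokenize_sketch('<p>Hello, <b>World</b>!</p>');
print "[$_]\n" for @$tokens;
```

A pattern this simple silently drops a stray "<" that is never closed and mishandles comments or CDATA sections containing ">", so it is only suitable as an illustration.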

XML tree builder

MKDoc::XML::TreeBuilder sits on top of MKDoc::XML::Tokenizer and builds parsed trees out of your XML / XHTML data.
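
The idea of building a tree on top of a token stream can be sketched with a stack of open elements. The code below is core Perl only; build_tree_sketch and the hash layout (tag, children) are assumptions for illustration and do not reflect the actual MKDoc::XML::TreeBuilder data structure or API.

```perl
use strict;
use warnings;

# Hypothetical helper, NOT the MKDoc::XML::TreeBuilder API: build a nested
# structure from tag/text tokens using a stack of currently open elements.
sub build_tree_sketch {
    my ($xml) = @_;
    my $root  = { tag => '#root', children => [] };
    my @stack = ($root);

    for my $token ($xml =~ /(<[^>]*>|[^<]+)/g) {
        if ($token =~ m{^</}) {
            # Closing tag: go back to the parent element (never pop the root).
            pop @stack if @stack > 1;
        }
        elsif ($token =~ m{^<([A-Za-z][\w:.-]*)[^>]*?(/?)>$}) {
            my $node = { tag => $1, children => [] };
            push @{ $stack[-1]{children} }, $node;
            # A self-closing tag such as <br /> has no children to collect.
            push @stack, $node unless $2;
        }
        else {
            # Text (and anything else, e.g. comments) is kept as a plain string.
            push @{ $stack[-1]{children} }, $token;
        }
    }
    return $root->{children};
}

my $tree = build_tree_sketch('<p>Hello, <b>World</b>!<br /></p>');
print scalar(@{ $tree->[0]{children} }), " children under <$tree->[0]{tag}>\n";
```

The sketch ignores attributes and does not check that closing tags match the element they close.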

XML stripper

MKDoc::XML::Stripper objects remove unwanted markup from your XML / HTML data. This is useful, for example, for removing presentational tags or 'style' attributes from your XHTML data.
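
As a rough illustration of markup stripping, the following core-Perl sketch keeps only tags from an allowed list and discards everything else. strip_sketch is a hypothetical function, not the MKDoc::XML::Stripper API, and for simplicity it drops every attribute on the tags it keeps.

```perl
use strict;
use warnings;

# Hypothetical helper, NOT the MKDoc::XML::Stripper API: keep only the tags
# named in @allowed, drop all other tags, and drop every attribute.
sub strip_sketch {
    my ($xml, @allowed) = @_;
    my %keep = map { ( lc($_) => 1 ) } @allowed;

    # $1 = "/" for a closing tag, $2 = tag name, $3 = "/" for a self-closing tag.
    $xml =~ s{<(/?)([A-Za-z][\w:.-]*)[^>]*?(/?)>}{ $keep{lc $2} ? "<$1$2$3>" : '' }ge;
    return $xml;
}

print strip_sketch('<p style="color:red"><font size="2">Hi</font><br /></p>', qw(p br)), "\n";
# prints: <p>Hi<br/></p>
```

Text content is left untouched; only the tags themselves are removed or simplified.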

XML tagger

The MKDoc::XML::Tagger module matches expressions in XML / XHTML documents and tags them appropriately. For example, you could automatically hyperlink certain glossary words or add <abbr> tags based on a dictionary of abbreviations and acronyms.
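
The <abbr> use case can be sketched in core Perl as follows. tag_sketch is a hypothetical function and not the MKDoc::XML::Tagger API; it assumes the title text is already safe to place in an attribute. The point it illustrates is that matching happens in text only, never inside existing tags.

```perl
use strict;
use warnings;

# Hypothetical helper, NOT the MKDoc::XML::Tagger API: wrap every whole-word
# occurrence of $expr found in text with <abbr title="...">.
sub tag_sketch {
    my ($xml, $expr, $title) = @_;
    my @tokens = $xml =~ /(<[^>]*>|[^<]+)/g;
    my $out    = '';

    for my $token (@tokens) {
        # Only text tokens are rewritten; tags pass through untouched, so
        # attribute values that happen to contain $expr are left alone.
        $token =~ s{\b(\Q$expr\E)\b}{<abbr title="$title">$1</abbr>}g
            unless $token =~ /^</;
        $out .= $token;
    }
    return $out;
}

print tag_sketch('<p title="XML">XML is fun</p>', 'XML', 'Extensible Markup Language'), "\n";
# prints: <p title="XML"><abbr title="Extensible Markup Language">XML</abbr> is fun</p>
```

The sketch does not check whether the matched text already sits inside an <abbr> or <a> element, which a real tagger would need to handle.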

XML entity decoder

MKDoc::XML::Decode is a pluggable, configurable entity expander module which currently supports HTML entities, numeric entities and basic XML entities.

XML entity encoder

MKDoc::XML::Encode performs the reverse operation of MKDoc::XML::Decode.
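
To illustrate entity encoding and decoding as a pair of inverse operations, here is a core-Perl sketch limited to the five basic XML entities and numeric character references. encode_sketch and decode_sketch are hypothetical names, not the MKDoc::XML::Encode or MKDoc::XML::Decode API, and HTML named entities are deliberately left out.

```perl
use strict;
use warnings;

# Hypothetical helpers, NOT the MKDoc::XML::Encode / MKDoc::XML::Decode API.
my %ENTITY_FOR = ('&' => '&amp;', '<' => '&lt;', '>' => '&gt;',
                  '"' => '&quot;', "'" => '&apos;');
my %CHAR_FOR   = (amp => '&', lt => '<', gt => '>', quot => '"', apos => "'");

# Replace the five characters that are special in XML with entities.
sub encode_sketch {
    my ($text) = @_;
    $text =~ s/([&<>"'])/$ENTITY_FOR{$1}/g;
    return $text;
}

# Expand hexadecimal references, decimal references and the five basic
# named entities in a single pass; unknown named entities are left as-is.
sub decode_sketch {
    my ($xml) = @_;
    $xml =~ s{&(?:#x([0-9A-Fa-f]+)|#([0-9]+)|(\w+));}{
        defined $1             ? chr(hex $1)
      : defined $2             ? chr($2)
      : exists $CHAR_FOR{$3}   ? $CHAR_FOR{$3}
      :                          "&$3;"
    }ge;
    return $xml;
}

my $text = q{Tom & "Jerry" <3};
my $xml  = encode_sketch($text);
print "$xml\n";
print decode_sketch($xml) eq $text ? "round trip ok\n" : "round trip failed\n";
```

Decoding in a single pass matters: a string such as "&amp;lt;" becomes "&lt;" and is not expanded a second time.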

XML Dumper

MKDoc::XML::Dumper serializes arbitrarily complex Perl structures into XML strings. It can also perform the reverse operation, i.e. deserializing an XML string into a Perl structure.


Copyright 2003 - MKDoc Holdings Ltd.

Author: Jean-Michel Hiver

This module is free software and is distributed under the same license as Perl itself. Use it at your own risk.



Help us open-source MKDoc. Join the mkdoc-modules mailing list: