NAME
XML::Struct - Represent XML as data structure preserving element order
VERSION
version 0.20
SYNOPSIS
use XML::Struct qw(readXML writeXML simpleXML removeXMLAttr);
my $xml = readXML( "input.xml" );
# [ root => { xmlns => 'http://example.org/' }, [ '!', [ x => {}, [42] ] ] ]
my $doc = writeXML( $xml );
# <?xml version="1.0" encoding="UTF-8"?>
# <root xmlns="http://example.org/">!<x>42</x></root>
my $simple = simpleXML( $xml, root => 'record' );
# { record => { xmlns => 'http://example.org/', x => 42 } }
my $xml2 = removeXMLAttr($xml);
# [ root => [ '!', [ x => [42] ] ] ]
DESCRIPTION
XML::Struct implements a mapping between XML and Perl data structures. By default, the mapping preserves element order, so it is also suitable for "document-oriented" XML. In short, an XML element is represented as an array reference with three parts:
[ $name => \%attributes, \@children ]
This data structure corresponds to the abstract data model of MicroXML, a simplified subset of XML.
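As an illustration of this three-part structure, the following minimal sketch walks an element and collects element names in document order. The helper name element_names is hypothetical and not part of XML::Struct; the sample data is taken from the SYNOPSIS.

```perl
use strict;
use warnings;

# Hypothetical helper (not part of XML::Struct): collect element names
# in document order from a [ $name => \%attributes, \@children ] structure.
sub element_names {
    my ($element) = @_;
    my ($name, $attributes, $children) = @$element;
    my @names = ($name);
    for my $child (@{ $children || [] }) {
        # Text nodes are plain scalars; child elements are array references.
        push @names, element_names($child) if ref $child eq 'ARRAY';
    }
    return @names;
}

my $doc = [ root => { xmlns => 'http://example.org/' }, [ '!', [ x => {}, [42] ] ] ];
my @names = element_names($doc);
print join(', ', @names), "\n";    # prints "root, x"
```

Because children are kept in one ordered array, text and elements can be interleaved without losing their position.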
If your XML documents don't contain relevant attributes, you can also choose to map to this format:
[ $name => \@children ]
Both parsing (with XML::Struct::Reader or function readXML) and serializing (with XML::Struct::Writer or function writeXML) are fully based on XML::LibXML, so performance is better than XML::Simple and similar to XML::LibXML::Simple.
MODULES
- XML::Struct::Reader
  Parse XML as stream into XML data structures.
- XML::Struct::Writer
  Write XML data structures to XML streams for serializing, SAX processing, or creating a DOM object.
- XML::Struct::Writer::Stream
  Simplified SAX handler for XML serialization.
FUNCTIONS
The following functions are exported on request:
readXML( $source [, %options ] )
Read an XML document with XML::Struct::Reader. The type of source (string, filename, URL, IO Handle...) is detected automatically. Options not known to XML::Struct::Reader are passed to XML::LibXML::Reader.
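A minimal usage sketch, assuming XML::Struct is installed and that a string beginning with "<" is detected as XML content rather than as a filename. The input mirrors the document shown in the SYNOPSIS, so the result should correspond to the structure given there.

```perl
use strict;
use warnings;
use XML::Struct qw(readXML);

# Assumption: a string source starting with "<" is parsed as XML content.
my $xml = readXML('<root xmlns="http://example.org/">!<x>42</x></root>');

# Unpack the three parts of the root element.
my ($name, $attributes, $children) = @$xml;
print "$name has ", scalar(@$children), " child nodes\n";
```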
writeXML( $xml [, %options ] )
Write an XML document/element with XML::Struct::Writer.
simpleXML( $element [, %options ] )
Transform an XML document/element into simple key-value format as known from XML::Simple: Attributes and child elements are treated as hash keys with their content as value. Text elements without attributes are converted to text and empty elements without attributes are converted to empty hashes. The following options are supported:
- root
  Keep the root element (just as option KeepRoot in XML::Simple). In addition, one can set the name of the root element by passing a non-numeric value.
- depth
  Only transform to a given depth. See XML::Struct::Reader for documentation.
  All elements below the given depth are returned unmodified (not cloned) as array elements:
      $data = simpleXML($xml, depth => 2);
      $content = $data->{x}->{y}; # array or scalar (if existing)
- attributes
  Assume input without attributes if set to a false value. The special value remove will first remove attributes, so the following three are equivalent:
      my @children = (['a'],['b']);
      simpleXML( [ $name => \@children ], attributes => 0 );
      simpleXML( removeXMLAttr( [ $name => \%attributes, \@children ] ), attributes => 0 );
      simpleXML( [ $name => \%attributes, \@children ], attributes => 'remove' );
Key attributes (KeyAttr in XML::Simple) and the option ForceArray are not supported yet.
removeXMLAttr( $element )
Transform an XML structure with attributes into an XML structure without attributes. The function does not modify the passed element but creates a modified copy.
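To make the transformation concrete, here is a pure-Perl sketch of the same idea. The helper strip_attributes is hypothetical and is not the module's implementation; it only illustrates how a copy without attribute hashes can be built recursively.

```perl
use strict;
use warnings;

# Hypothetical helper (not the module's code): build a copy of
# [ $name => \%attributes, \@children ] as [ $name => \@children ].
sub strip_attributes {
    my ($element) = @_;
    my ($name, $attributes, $children) = @$element;
    # Recurse into child elements; copy text nodes as they are.
    my @copy = map { ref $_ eq 'ARRAY' ? strip_attributes($_) : $_ }
               @{ $children || [] };
    return [ $name => \@copy ];
}

my $with    = [ root => { xmlns => 'http://example.org/' }, [ '!', [ x => {}, [42] ] ] ];
my $without = strip_attributes($with);
# $without is [ root => [ '!', [ x => [42] ] ] ]; $with is left untouched.
```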
EXAMPLE
To give an example, with XML::Struct::Reader, this XML document:
<root>
  <foo>text</foo>
  <bar key="value">
    text
    <doz/>
  </bar>
</root>
is transformed to this structure:
[
    "root", { }, [
        [ "foo", { }, [ "text" ] ],
        [ "bar", { key => "value" }, [
            "text",
            [ "doz", { }, [ ] ]
        ] ]
    ]
]
This module also supports a simple key-value (aka "data-oriented") format, as used by XML::Simple. With option simple (or function simpleXML) the document given above would be transformed to this structure:
{
foo => "text",
bar => {
key => "value",
doz => {}
}
}
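The mapping rules behind this key-value format can be sketched in pure Perl. The helper to_simple below is hypothetical and much simpler than simpleXML: it assumes that text in mixed content is dropped and that a repeated child name overwrites the earlier value, neither of which is confirmed as the module's behavior.

```perl
use strict;
use warnings;

# Hypothetical helper, not the module's implementation: convert the content
# of a [ $name => \%attributes, \@children ] element to key-value form.
sub to_simple {
    my ($element) = @_;
    my ($name, $attributes, $children) = @$element;
    my @elements = grep { ref $_ eq 'ARRAY' } @$children;
    my @text     = grep { !ref $_ } @$children;

    # Text-only element without attributes: return the text itself.
    return join('', @text) if !%$attributes && !@elements && @text;

    # Otherwise attributes and child elements become hash keys.
    # Assumption: text in mixed content is dropped, and a repeated
    # child name overwrites the earlier value.
    my %hash = %$attributes;
    $hash{ $_->[0] } = to_simple($_) for @elements;
    return \%hash;
}

my $root = [ "root", {}, [
    [ "foo", {}, [ "text" ] ],
    [ "bar", { key => "value" }, [ "text", [ "doz", {}, [] ] ] ],
] ];
my $simple = to_simple($root);
# $simple is { foo => "text", bar => { key => "value", doz => {} } }
```

The root element name is not kept here, which matches the structure shown above when the root option is not set.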
SEE ALSO
This module was first created to be used in Catmandu::XML and turned out to also become a replacement for XML::Simple. See the former for more XML processing.
XML::Twig is another popular and powerful module for stream-based processing of XML documents.
See XML::Smart, XML::Hash::LX, XML::Parser::Style::ETree, XML::Fast, and XML::Structured for different representations of XML data as data structures (feel free to implement converters from/to XML::Struct). XML::GenericJSON seems to be an outdated and incomplete attempt to capture more parts of XML Infoset in another data structure.
See JSONx for a kind of reverse direction (JSON in XML).
AUTHOR
Jakob Voß
COPYRIGHT AND LICENSE
This software is copyright (c) 2014 by Jakob Voß.
This is free software; you can redistribute it and/or modify it under the same terms as the Perl 5 programming language system itself.