NAME

XML::Rewrite - schema based XML cleanups

INHERITANCE

 XML::Rewrite
   is a XML::Compile::Cache
   is a XML::Compile::Schema
   is a XML::Compile

 XML::Rewrite is extended by
   XML::Rewrite::Schema

SYNOPSIS

 my $rewriter = XML::Rewrite->new(...);
 my ($type, $data) = $rewriter->process($file);
 my $doc = $rewriter->buildDOM($type => $data);

DESCRIPTION

Often, XML messages and schemas are created by automatic tools. These tools may provide very nice user interfaces, but tend to produce horrible XML. If you have to read these ugly products, you are in for pain. The purpose of this module (and the script xmlrewrite, which is part of this distribution) is to rewrite XML messages and schemas into something maintainable.

The main difference between this module and other beautifiers is that the clean-up is based on schema rules. For instance, it is permitted to remove blanks around and inside integers, but not in strings. Beautifiers which do not look into the schema have only limited possibilities for cleanup, or may accidentally change the message content.
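
The idea of type-driven cleanup can be illustrated with a small sketch in plain Perl. It does not use this module and is not how the module is implemented: the %type_of table stands in for a real schema, and cleanup_value() is a hypothetical helper. It only shows why knowing the declared type decides whether blanks may be removed.

```perl
use strict;
use warnings;

# Hypothetical stand-in for a schema: element name => simple type.
my %type_of = (count => 'integer', note => 'string');

# Remove blanks only where the declared type makes them meaningless.
sub cleanup_value {
    my ($name, $value) = @_;
    my $type = $type_of{$name} // 'string';

    # Blanks around and inside an integer carry no information.
    $value =~ s/\s+//g if $type eq 'integer';

    # Strings are returned untouched: their blanks are content.
    return $value;
}

my %message = (count => "  42\n", note => "  keep  these  blanks ");
my %clean   = map { $_ => cleanup_value($_, $message{$_}) } keys %message;

printf "%s=[%s]\n", $_, $clean{$_} for sort keys %clean;
```

A beautifier without the type table would have to treat both fields alike: either leave the integer padded, or risk changing the string.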

Ideas for useful features are welcome.

Extends "DESCRIPTION" in XML::Compile::Cache.

METHODS

Extends "METHODS" in XML::Compile::Cache.

Constructors

Extends "Constructors" in XML::Compile::Cache.

XML::Rewrite->new( [SCHEMA], OPTIONS )

The rewrite object is based on an XML::Compile::Cache object, which defines the message structures. The processing instructions can only be specified at instance creation, because we need to be able to reuse the compiled translators when we wish to process multiple messages.

 -Option               --Defined in          --Default
  allow_undeclared       XML::Compile::Cache   <true>
  any_element            XML::Compile::Cache   'ATTEMPT'
  blanks_before                                'NONE'
  block_namespace        XML::Compile::Schema  []
  change                                       'TRANSFORM'
  comments                                     'KEEP'
  defaults_writer                              'IGNORE'
  hook                   XML::Compile::Schema  undef
  hooks                  XML::Compile::Schema  []
  ignore_unused_tags     XML::Compile::Schema  <false>
  key_rewrite            XML::Compile::Schema  []
  opts_readers           XML::Compile::Cache   []
  opts_rw                XML::Compile::Cache   []
  opts_writers           XML::Compile::Cache   []
  output_compression                           <undef>
  output_encoding                              <undef>
  output_standalone                            <undef>
  output_version                               <undef>
  parser_options         XML::Compile          <many>
  prefixes               XML::Compile::Cache   <smart>
  remove_elements                              []
  schema_dirs            XML::Compile          undef
  typemap                XML::Compile::Cache   {}
  use_default_namespace                        <false>
  xsi_type               XML::Compile::Cache   {}
allow_undeclared => BOOLEAN
any_element => CODE|'TAKE_ALL'|'SKIP_ALL'|'ATTEMPT'|'SLOPPY'
blanks_before => 'ALL'|'CONTAINERS'|'NONE'

Automatically put a blank line before each child of the root element: either before ALL children, or only before CONTAINERS (those which have children themselves). Put _BLANK_LINE in the HASH output of the reader to change the selection at specific locations.

block_namespace => NAMESPACE|TYPE|HASH|CODE|ARRAY
change => 'REPAIR'|'TRANSFORM'

How to behave: either overrule the message settings (repair broken messages), or change the output. If you need both a correction and a transformation, you will need to run the rewrite twice (create two rewriter objects).

comments => 'REMOVE'|'KEEP'

Comments found in the input may get translated into _COMMENT and _COMMENT_AFTER fields in the intermediate HASH. You may add your own before you reconstruct the DOM. Comments are expected to be used just before the element they belong to.

defaults_writer => 'EXTEND'|'IGNORE'|'MINIMAL'

See XML::Compile::Schema::compile(default_values)

hook => $hook|ARRAY
hooks => ARRAY
ignore_unused_tags => BOOLEAN|REGEXP
key_rewrite => HASH|CODE|ARRAY
opts_readers => HASH|ARRAY-of-PAIRS
opts_rw => HASH|ARRAY-of-PAIRS
opts_writers => HASH|ARRAY-of-PAIRS
output_compression => -1, 0-8

Set output compression level. A value of -1 means that there should be no compression. By default, the compression level of the input document is used.

output_encoding => CHARSET

The character-set is usually copied from the source document, but you can overrule this. If neither the rewriter object nor the document defines an encoding, then UTF-8 is used.

output_standalone => BOOLEAN|'yes'|'no'

If specified, it will overrule the value found in the document. If not provided, the value from the source document will be used, but only when present.

output_version => STRING

The XML version for the document. This overrules the version found in the document. If neither is specified, then 1.0 is used.

parser_options => HASH|ARRAY
prefixes => HASH|ARRAY-of-PAIRS
remove_elements => ARRAY

All the selected elements are removed. However, you must not remove elements which are required by the schema.

schema_dirs => $directory|ARRAY-OF-directories
typemap => HASH|ARRAY
use_default_namespace => BOOLEAN

If true, the blank prefix will be used for the first name-space needed (usually the name-space of the top-level element). Otherwise, the blank prefix will not be used unless already defined explicitly in the provided prefix table.

xsi_type => HASH|ARRAY
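
As a minimal sketch of how a few of these options might be combined, the following creates one rewriter and sends one message through it. The inline schema, the http://example.com/demo namespace and the message are made up for illustration. Two things are assumptions not stated above: that an XML string is accepted as SCHEMA, and that buildDOM() returns an XML::LibXML::Document (so that toString() can be called on it).

```perl
use strict;
use warnings;
use XML::Rewrite;

# Hypothetical schema, inlined to keep the sketch self-contained.
my $schema = <<'XSD';
<schema xmlns="http://www.w3.org/2001/XMLSchema"
        targetNamespace="http://example.com/demo"
        elementFormDefault="qualified">
  <element name="order">
    <complexType>
      <sequence>
        <element name="count" type="int"/>
        <element name="note"  type="string"/>
      </sequence>
    </complexType>
  </element>
</schema>
XSD

# All processing instructions are given at construction time.
my $rewriter = XML::Rewrite->new(
    $schema,
    change          => 'TRANSFORM',
    comments        => 'REMOVE',
    output_encoding => 'UTF-8',
    output_version  => '1.0',
);

# Hypothetical input message with padded integer and a comment.
my $message = <<'XML';
<d:order xmlns:d="http://example.com/demo">
  <!-- generated by some tool -->
  <d:count>  42  </d:count>
  <d:note>handle with care</d:note>
</d:order>
XML

my ($type, $data) = $rewriter->process($message);
my $doc = $rewriter->buildDOM($type => $data);

# Assumption: $doc is an XML::LibXML::Document.
print $doc->toString(1);
```

Because the options are fixed in the object, the same $rewriter can be reused for any number of messages of the same schema.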

Accessors

Extends "Accessors" in XML::Compile::Cache.

$obj->addHook($hook|LIST|undef)

Inherited, see "Accessors" in XML::Compile::Schema

$obj->addHooks( $hook, [$hook, ...] )

Inherited, see "Accessors" in XML::Compile::Schema

$obj->addKeyRewrite($predef|CODE|HASH, ...)

Inherited, see "Accessors" in XML::Compile::Schema

$obj->addSchemaDirs(@directories|$filename)
XML::Rewrite->addSchemaDirs(@directories|$filename)

Inherited, see "Accessors" in XML::Compile

$obj->addSchemas($xml, %options)

Inherited, see "Accessors" in XML::Compile::Schema

$obj->addTypemap(PAIR)

Inherited, see "Accessors" in XML::Compile::Schema

$obj->addTypemaps(PAIRS)

Inherited, see "Accessors" in XML::Compile::Schema

$obj->addXsiType( [HASH|ARRAY|LIST] )

Inherited, see "Accessors" in XML::Compile::Cache

$obj->allowUndeclared( [BOOLEAN] )

Inherited, see "Accessors" in XML::Compile::Cache

$obj->anyElement('ATTEMPT'|'SLOPPY'|'SKIP_ALL'|'TAKE_ALL'|CODE)

Inherited, see "Accessors" in XML::Compile::Cache

$obj->blockNamespace($ns|$type|HASH|CODE|ARRAY)

Inherited, see "Accessors" in XML::Compile::Schema

$obj->hooks( [<'READER'|'WRITER'>] )

Inherited, see "Accessors" in XML::Compile::Schema

$obj->typemap( [HASH|ARRAY|PAIRS] )

Inherited, see "Accessors" in XML::Compile::Cache

$obj->useSchema( $schema, [$schema, ...] )

Inherited, see "Accessors" in XML::Compile::Schema

Prefix management

Extends "Prefix management" in XML::Compile::Cache.

$obj->addNicePrefix(BASE, NAMESPACE)

Inherited, see "Prefix management" in XML::Compile::Cache

$obj->addPrefixes( [PAIRS|ARRAY|HASH] )

Inherited, see "Prefix management" in XML::Compile::Cache

$obj->learnPrefixes($node)

Inherited, see "Prefix management" in XML::Compile::Cache

$obj->prefix($prefix)

Inherited, see "Prefix management" in XML::Compile::Cache

$obj->prefixFor($uri)

Inherited, see "Prefix management" in XML::Compile::Cache

$obj->prefixed( $type|<$ns,$local> )

Inherited, see "Prefix management" in XML::Compile::Cache

$obj->prefixes( [$params] )

Inherited, see "Prefix management" in XML::Compile::Cache

Compilers

Extends "Compilers" in XML::Compile::Cache.

$obj->addCompileOptions( ['READERS'|'WRITERS'|'RW'], %options )

Inherited, see "Compilers" in XML::Compile::Cache

$obj->compile( <'READER'|'WRITER'>, $type, %options )

Inherited, see "Compilers" in XML::Compile::Schema

$obj->compileAll( ['READERS'|'WRITERS'|'RW', [$ns]] )

Inherited, see "Compilers" in XML::Compile::Cache

$obj->compileType( <'READER'|'WRITER'>, $type, %options )

Inherited, see "Compilers" in XML::Compile::Schema

$obj->dataToXML($node|REF-XML|XML-STRING|$filename|$fh|$known)
XML::Rewrite->dataToXML($node|REF-XML|XML-STRING|$filename|$fh|$known)

Inherited, see "Compilers" in XML::Compile

$obj->initParser(%options)
XML::Rewrite->initParser(%options)

Inherited, see "Compilers" in XML::Compile

$obj->reader($type|$name, %options)

Inherited, see "Compilers" in XML::Compile::Cache

$obj->template( <'XML'|'PERL'|'TREE'>, $element, %options )

Inherited, see "Compilers" in XML::Compile::Schema

$obj->writer($type|$name)

Inherited, see "Compilers" in XML::Compile::Cache

Administration

Extends "Administration" in XML::Compile::Cache.

$obj->declare( <'READER'|'WRITER'|'RW'>, <$type|ARRAY>, %options )

Inherited, see "Administration" in XML::Compile::Cache

$obj->doesExtend($exttype, $basetype)

Inherited, see "Administration" in XML::Compile::Schema

$obj->elements()

Inherited, see "Administration" in XML::Compile::Schema

$obj->findName($name)

Inherited, see "Administration" in XML::Compile::Cache

$obj->findSchemaFile($filename)
XML::Rewrite->findSchemaFile($filename)

Inherited, see "Administration" in XML::Compile

$obj->importDefinitions($xmldata, %options)

Inherited, see "Administration" in XML::Compile::Schema

$obj->knownNamespace($ns|PAIRS)
XML::Rewrite->knownNamespace($ns|PAIRS)

Inherited, see "Administration" in XML::Compile

$obj->namespaces()

Inherited, see "Administration" in XML::Compile::Schema

$obj->printIndex( [$fh], %options )

Inherited, see "Administration" in XML::Compile::Cache

$obj->types()

Inherited, see "Administration" in XML::Compile::Schema

$obj->walkTree($node, CODE)

Inherited, see "Administration" in XML::Compile

Processing

$obj->buildDOM(TYPE, DATA, OPTIONS)
$obj->process(XMLDATA, OPTIONS)

XMLDATA must be XML as accepted by dataToXML(). Returns a LIST of two items: the type of the data-structure read, and the HASH representation of the contained data.

 -Option--Default
  type    <from root element>
type => TYPE

Explicit TYPE of the root element, required in case of namespace-less elements or other namespace problems.
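
A sketch of passing the type explicitly, under the same assumptions as any other use of process(): the schema, namespace and message below are hypothetical, and an XML string is assumed to be accepted as SCHEMA. The type is written in the "{namespace}local" form produced by pack_type() from XML::Compile::Util.

```perl
use strict;
use warnings;
use XML::Rewrite;
use XML::Compile::Util qw(pack_type);

# Hypothetical schema and namespace, for illustration only.
my $ns     = 'http://example.com/demo';
my $schema = <<"XSD";
<schema xmlns="http://www.w3.org/2001/XMLSchema"
        targetNamespace="$ns"
        elementFormDefault="qualified">
  <element name="order">
    <complexType>
      <sequence>
        <element name="count" type="int"/>
      </sequence>
    </complexType>
  </element>
</schema>
XSD

my $rewriter = XML::Rewrite->new($schema);

my $message = <<"XML";
<order xmlns="$ns">
  <count>7</count>
</order>
XML

# Name the root element type instead of deriving it from the message.
my $wanted = pack_type $ns, 'order';
my ($type, $data) = $rewriter->process($message, type => $wanted);

print "processed as $type\n";
```

In this message the type could also have been derived from the root element; the explicit form matters when the root element carries no usable namespace.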

$obj->repairXML(TYPE, XML, DETAILS)

Takes the TYPE of the root element, the root XML element, and DETAILS about the XML origin.

$obj->transformData(TYPE, DATA, DETAILS)

Takes the TYPE of the root element, the HASH representation of the DATA of the message, and DETAILS about the XML origin.

DETAILS

Extends "DETAILS" in XML::Compile::Cache.

DESCRIPTIONS

Extends "DESCRIPTIONS" in XML::Compile::Cache.

SEE ALSO

This module is part of XML-Rewrite distribution version 0.11, built on May 11, 2018. Website: http://perl.overmeer.net/CPAN/

LICENSE

Copyrights 2008-2018 by [Mark Overmeer <markov@cpan.org>]. For other contributors see ChangeLog.

This program is free software; you can redistribute it and/or modify it under the same terms as Perl itself. See http://dev.perl.org/licenses/