NAME

XML::SAX::Writer - SAX2 XML Writer

SYNOPSIS

  use XML::SAX::Writer;
  use XML::SAX::SomeDriver;

  my $w = XML::SAX::Writer->new;
  my $d = XML::SAX::SomeDriver->new(Handler => $w);

  $d->parse('some options...');

DESCRIPTION

Why yet another XML Writer?

A new XML Writer was needed to match the SAX2 effort because quite naturally no existing writer understood SAX2. My first intention had been to start patching XML::Handler::YAWriter as it had previously been my favourite writer in the SAX1 world.

However, the more I patched it, the more I realised that what I thought was going to be a simple patch (mostly adding a few event handlers and changing the attribute syntax) was turning into a rewrite, due to various ideas I'd been collecting along the way. Besides, I couldn't find a way to elegantly make it work with SAX2 without breaking compatibility with SAX1, which people are probably still using. There are of course ways to do that, but most require user interaction, which is something I wanted to avoid.

So in the end there was a new writer. I think it's in fact better this way as it helps keep SAX1 and SAX2 separated.

METHODS

  • new(%hash)

    This is the constructor for this object. It takes a number of parameters, all of which are optional.

  • -- Output

    This parameter can be one of several things. If it is a simple scalar, it is interpreted as a filename which will be opened for writing. If it is a scalar reference, output will be appended to this scalar. If it is an array reference, output will be pushed onto this array as it is generated. If it is a filehandle, then output will be sent to this filehandle.

    Finally, it is possible to pass an object for this parameter, in which case it is assumed to be an object that implements the consumer interface described later in the documentation.

    If this parameter is not provided, then output is sent to STDOUT.

  • -- Escape

    This should be a hash reference where the keys are character sequences that should be escaped and the values are the escaped form of each sequence. By default, this module will escape the ampersand (&), less than (<), greater than (>), double quote ("), apostrophe ('), and double dash (--) character sequences. Note that the double dash escape is needed for comments, and that some browsers don't support the &apos; escape used for apostrophes, so you should be careful when outputting XHTML.

    If you only want to add entries to the Escape hash, you can first copy the contents of %XML::SAX::Writer::DEFAULT_ESCAPE.

  • -- EncodeFrom

    The character set encoding in which incoming data will be provided. This defaults to UTF-8, which works for US-ASCII as well.

  • -- EncodeTo

    The character set encoding in which output should be encoded. Again, this defaults to UTF-8.
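    The constructor options above can be combined as in the following minimal sketch. It assumes XML::SAX::PurePerl (shipped with XML::SAX) as the driver; the tab escape and the variable names are purely illustrative and not something this module requires.

```perl
use strict;
use warnings;
use XML::SAX::Writer;
use XML::SAX::PurePerl;   # assumption: any SAX2 driver would do here

# Start from the module's defaults and add one extra entry.
# The tab entry is a hypothetical example of a custom escape.
my %escape = (
    %XML::SAX::Writer::DEFAULT_ESCAPE,
    "\t" => '&#9;',
);

my $xml = '';
my $w = XML::SAX::Writer->new(
    Output     => \$xml,      # scalar reference: output is collected in $xml
    Escape     => \%escape,
    EncodeFrom => 'UTF-8',    # the documented defaults, spelled out
    EncodeTo   => 'UTF-8',
);

my $d = XML::SAX::PurePerl->new(Handler => $w);
$d->parse_string('<doc note="a &amp; b">x &lt; y</doc>');

print $xml, "\n";
```

    Passing a plain string such as 'out.xml' for Output would write to that file instead, and omitting Output sends the result to STDOUT.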

GENERATING XML

THE CONSUMER INTERFACE

XML::SAX::Writer can receive pluggable consumer objects that will be in charge of writing out the XML formatted by this module. Setting a Consumer is done by setting the Output option to the object of your choice instead of to an array, scalar, or file handle as is more commonly done (internally those in fact map to Consumer classes and are simply available as options for your convenience).

If you don't understand this, don't worry. You don't need it most of the time.

That object can be from any class, but must have two methods in its API. It is also strongly recommended that it inherits from XML::SAX::Writer::ConsumerInterface so that it will not break if that interface evolves over time. There are examples at the end of XML::SAX::Writer's code.

The two methods that it needs to implement are:

  • output(String)

    This is called whenever the Writer wants to output a string formatted in XML. Encoding conversion, character escaping, and formatting have already taken place. It's up to the consumer to do whatever it wants with the string.

  • finalize()

    This is called once the document has been output in its entirety, during the end_document event. end_document will in fact return whatever finalize() returns, and that in turn should be returned by parse() for whatever parser was invoked. It might be useful if you need to provide feedback of some sort.
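As an illustration of the two methods, here is a minimal sketch of a consumer that collects the output and reports its length. The class name My::CountingConsumer is hypothetical. To keep the sketch runnable without the module installed, it does not inherit from XML::SAX::Writer::ConsumerInterface (a real consumer probably should, as recommended above), and the writer's calls are simulated by hand.

```perl
use strict;
use warnings;

package My::CountingConsumer;   # hypothetical class name

sub new {
    my ($class) = @_;
    return bless { chunks => [], chars => 0 }, $class;
}

# Called by the writer with each already-escaped, already-encoded string.
sub output {
    my ($self, $str) = @_;
    push @{ $self->{chunks} }, $str;
    $self->{chars} += length $str;
    return;
}

# Called once at end_document; the return value is what end_document returns.
sub finalize {
    my ($self) = @_;
    return {
        xml    => join('', @{ $self->{chunks} }),
        length => $self->{chars},
    };
}

package main;

my $consumer = My::CountingConsumer->new;

# Simulate what the writer would do while a document is serialised.
$consumer->output($_) for '<doc>', 'hello', '</doc>';
my $result = $consumer->finalize;

print "$result->{xml} ($result->{length} characters)\n";
```

In real use the object would be handed to the writer with XML::SAX::Writer->new(Output => $consumer), and the hash reference returned by finalize() would come back from the driver's parse call.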

TODO

    - make the quote character an option. By default it is here ', but
    I know that a lot of people (for reasons I don't understand but
    won't question :-) prefer to use ". (on most keyboards " is more
    typing, on the rest it's often as much typing).

    - the formatting options need to be developed.

    - test, test, test (and then some tests)

    - doc, doc, doc (actually this part is in better shape)

    - add support for Perl 5.7's Encode module so that we can use it
    instead of Text::Iconv. Encode is more complete and likely to be
    better supported overall. This will be done using a pluggable
    encoder (so that users can provide their own if they want to)
    and detecter both in Makefile.PL requirements and in the module
    at runtime.

CREDITS

Michael Koehne (XML::Handler::YAWriter) for much inspiration and Barrie Slaymaker for the Consumer pattern idea. Of course the usual suspects (Kip Hampton and Matt Sergeant) helped in the usual ways.

AUTHOR

Robin Berjon, robin@knowscape.com

COPYRIGHT

Copyright (c) 2001 Robin Berjon. All rights reserved. This program is free software; you can redistribute it and/or modify it under the same terms as Perl itself.

SEE ALSO

XML::SAX::*
