

Pod::PXML -- pxml2pod, pod2pxml


  use Pod::PXML;
  # Take POD from a file and write it out as XML:
  open(my $xml_out, '>', 'foo.xml') || die "can't write-open foo.xml: $!";
  print $xml_out Pod::PXML::pod2xml('foo.pod');
  close($xml_out);
  # Going the other way, take XML from a file and write it out as POD:
  open(my $pod_out, '>', 'foo.pod') || die "can't write-open foo.pod: $!";
  print $pod_out Pod::PXML::xml2pod('foo.xml');
  close($pod_out);
  # Or take POD from STDIN:
  print '', Pod::PXML::pod2xml(\join '', <STDIN>);
  # Or the other way:
  print '', Pod::PXML::xml2pod(\join '', <STDIN>);


Perl's documentation is conventionally expressed in Plain Old Documentation.

POD-format is a wonderfully concise text format, but it is quite idiosyncratic. This module seeks to make it easier to turn text that's in POD-format into XML, and to turn text that's in XML into POD-format.


This module is experimental!! It works right on almost all data, but there are a few oddities left -- mostly in the handling of odd L<...> syntax. Some of these are because of bugs in the current Pod::Tree version (1.06), and some of these are because of basic conceptual problems in perlpod. Both of these should be cleared up eventually. If you get strange results from this module, do email me.


    TODO: document options?

    TODO: allow treating comment blocks outside paragraphs as <!-- ... -->?

    $xml_text = Pod::PXML::pod2xml($filename);

    $xml_text = Pod::PXML::pod2xml(\$content);

    Returns XML content that represents the POD-format text that was input.

    $pod_text = Pod::PXML::xml2pod($filename);

    $pod_text = Pod::PXML::xml2pod(\$content);

    Returns POD-format content that represents the PXML text that was input.
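
As a minimal sketch of how the two functions fit together, the following converts an in-memory POD string to PXML and back again. The sample POD text is made up for illustration, and the defensive `require` is an assumption about your environment (the module may not be installed); only `pod2xml` and `xml2pod` with a scalar reference, as documented above, are used.

```perl
use strict;
use warnings;

# Sample POD held in memory (hypothetical content, for illustration only).
my $pod = join "\n",
    '=head1 DEMANDS',
    '',
    'My list of demands: I<pie>.',
    '',
    '=cut',
    '';

# Pod::PXML is experimental and may not be installed, so load it defensively.
if (eval { require Pod::PXML; 1 }) {
    # A scalar reference means "this is the content", not a filename.
    my $xml      = Pod::PXML::pod2xml(\$pod);
    my $pod_back = Pod::PXML::xml2pod(\$xml);
    print "--- PXML ---\n$xml\n--- POD again ---\n$pod_back\n";
}
else {
    warn "Pod::PXML is not available; skipping the round trip\n";
}
```

Comparing the two POD texts by eye is a quick way to spot the oddities mentioned earlier; the round trip is not guaranteed to be byte-for-byte identical.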


This module and the PXML DTD are still in the EXPERIMENTAL stage. If you don't like the way something works, or if you think something's broken, email me sooner rather than later! I mean for this module to be actually useful to people in their XMLificational PODulatory document doings.


This module's idea of XML isn't just any sort of XML, but is XML conforming to a DTD. "PXML" is what I call the document type that my DTD declares.

The design goals of PXML are to be a 1:1 representation of all meaningful distinctions you can make in valid POD-format -- if it's a meaningful distinction you can validly express in POD-format, I want to be able to convert that to isomorphic PXML. Moreover, I want to be able to write PXML that can represent any meaningful distinction in valid POD-format, once I convert it to POD-format.

So, whether you write "$a<$b" in POD-format as "$aE<lt>$b" or as "$aE<60>$b" is not a meaningful distinction, because "E<lt>" and "E<60>" represent the same character. However, the difference between "=head1" and "=head2" is meaningful, and the difference between "C<...>" and "F<...>" is meaningful, and so these distinctions should be present in the PXML representation of the POD.
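
To make the point about escapes concrete, here is a small sketch that resolves E<...> codes and shows that the named and numeric forms yield the same text. The helpers `decode_escape` and `decode_escapes` are hypothetical, written for this illustration and not part of Pod::PXML, and only a handful of named escapes are handled.

```perl
use strict;
use warnings;

# A few named escapes from perlpod; this table is deliberately incomplete.
my %named = (lt => '<', gt => '>', verbar => '|', sol => '/');

# Hypothetical helper: resolve the inside of one E<...> code.
sub decode_escape {
    my ($name) = @_;
    return chr($name)    if $name =~ /\A\d+\z/;      # numeric form, e.g. E<60>
    return $named{$name} if exists $named{$name};    # named form, e.g. E<lt>
    return "E<$name>";                               # unknown: leave untouched
}

# Hypothetical helper: resolve every E<...> code in a string.
sub decode_escapes {
    my ($text) = @_;
    $text =~ s/E<(\w+)>/decode_escape($1)/ge;
    return $text;
}

my $by_name   = decode_escapes('$aE<lt>$b');
my $by_number = decode_escapes('$aE<60>$b');
print "both forms give: $by_name\n" if $by_name eq $by_number;
```

Since both spellings decode to the same string, a converter is free to pick either one on output without losing anything meaningful.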

A secondary design goal is that PXML be as minimal as possible; specifically, there shouldn't be anything in PXML (whether element or attribute) that doesn't correspond directly to some part of POD-format.

So, while you might want to represent this:

  =item Foo

as this:


or while you might want to represent this:

  =head1 Foo

as this:


...those are not the way I do it, even tho I considered both. Why did I decide against those? Because there's no "label" or "section1" in POD-format.

Instead, I do:

  <item>Foo</item>

  <head1>Foo</head1>



    For any valid POD-format input you provide, this module should emit XML that conforms to the PXML DTD. For any XML input that you feed in that conforms to the PXML DTD, this module should emit valid POD.

POD-format / PXML Correspondences

The PXML DTD is still not entirely nailed down, but once it is, then this section should be rather more verbose.

  POD-format  -------------------------------  PXML

  A normal paragraph:
  Hummina hummina?                  =    <p>Hummina hummina?
  Woozle wuzzle.                         Woozle wuzzle.</p>


  A verbatim paragraph:             =    <pre>
    while(1) {                             while(1) {
      print "Matanga!!\n";                   print "Matanga!!\n";
    }                                      }</pre>


  =head1 DEMANDS
                                    =    <head1>DEMANDS</head1>
  My list of demands:                    <p>My list of demands:</p>

(ditto for head2, head3, head4)


  =over 5
                                         <list indent="5">
  =item 1.                          =    <item>1.</item>
                                         <p>I want pie.</p>
  I want pie.                            </list>


  Mmmmmmm.                               <p>Mmmmmmm.
  Glorious I<italic pie>,                Glorious <i>italic pie</i>,
  C<codic pie>, F<filed pie>,       =    <c>codic pie</c>, <f>filed pie</f>,
  B<boldened pie>, and even              <b>boldened pie</b>, and even
  X<indexed pie>.                        <x>indexed pie</x>.</p>
  And even S<nested unbroken        =    And even <s>nested unbroken
  I<italic B<boldened                    <i>italic <b>boldened
  C<codic pie>>>>!                       <c>codic pie</c></b></i></s>!
  See also L<rhubarb pie            =    <link page="Pie::Filling"
  filling|Pie::Filling/"Rhubarb">.       section="Rhubarb"
                                         >rhubarb pie filling</link>

(Formatting of L<...> elements where there's no L<text|...> is inconsistent across different POD renderers. I strongly advise that you always use the L<text|...> style.)
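
The L<text|page/"section"> correspondence in the table can be sketched as follows. `link_to_pxml` is a hypothetical helper written for this illustration, not the module's own code; it handles only that one form, and it assumes the text, page, and section contain no characters that would need XML escaping.

```perl
use strict;
use warnings;

# Hypothetical helper: build a PXML <link> element from the inside of an
# L<text|page/"section"> code. Any other form returns nothing.
sub link_to_pxml {
    my ($inner) = @_;
    $inner =~ s/\s+/ /g;    # the POD source may wrap across lines
    return unless $inner =~ m{\A([^|]+)\|([^/]+)/"([^"]+)"\z};
    my ($text, $page, $section) = ($1, $2, $3);
    return qq{<link page="$page" section="$section">$text</link>};
}

my $link = link_to_pxml(qq{rhubarb pie\nfilling|Pie::Filling/"Rhubarb"});
print "$link\n";
# prints: <link page="Pie::Filling" section="Rhubarb">rhubarb pie filling</link>
```

Forms without an explicit text part are left out of the sketch on purpose, since those are the ones that renderers treat inconsistently.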

If you're unsure about a particular POD-format construct, run pod2pxml on it, and see what happens. Be sure to report any oddities to me.

Note that XML PIs and comments are currently ignored by translation to POD. If you want comments that survive round-tripping pxml2pod2pxml, then you'd probably better put them in a

  <for target="comments">Comments Here</for>

block. And remember that those can't occur in the middle of paragraphs.
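
If you already have PXML containing ordinary XML comments, one way to keep them is to rewrite them into such blocks before converting. The sketch below does that with a regular expression; `xml_escape` and `comments_to_for_blocks` are hypothetical helpers, not part of Pod::PXML, and the code assumes every comment sits between block-level elements rather than inside a paragraph.

```perl
use strict;
use warnings;

# Escape the characters that are special in XML element content.
sub xml_escape {
    my ($text) = @_;
    $text =~ s/&/&amp;/g;
    $text =~ s/</&lt;/g;
    $text =~ s/>/&gt;/g;
    return $text;
}

# Hypothetical helper: turn each <!-- ... --> into a
# <for target="comments"> block so it is not dropped on conversion.
sub comments_to_for_blocks {
    my ($xml) = @_;
    $xml =~ s{<!--\s*(.*?)\s*-->}
             {'<for target="comments">' . xml_escape($1) . '</for>'}gse;
    return $xml;
}

my $out = comments_to_for_blocks(
    '<head1>Pie</head1><!-- check the recipe --><p>Mmm.</p>');
print "$out\n";
# prints: <head1>Pie</head1><for target="comments">check the recipe</for><p>Mmm.</p>
```

A regular expression cannot tell whether a comment is inside a paragraph, so for anything beyond simple documents a real XML parser would be the safer tool.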


There should be a Pod::Tree/Pod::Parser subclass that will deal with:

  =begin pxml
  =end pxml

and parse it as if it were POD-format, transparently.

Conversely, there should be improved facility for reading POD-format transparently as PXML.

Smarter support for E<...> in pxml2pod -- currently most high-bit characters just end up as E<number>.

Make the XML output indented?

(Optionally?) Collapse non-verbatim whitespace in pxml2pod? Also (optionally?) re-wrap?

Handling of XML namespaces? At least for skipping foreign-namespace elements? Tell me what you want.

Handling of different encodings? Allow specifying UTF-8 / Latin-1 POD to/from UTF-8 / Latin-1 PXML?


perlpod documents POD-format.

Pod::Tree is the class that I use for parsing POD-format.

XML::Parser is the class that I use for parsing XML.

Pod::Parser is a different POD-format parser class.

Pod::XML is Matt Sergeant's approach to this, and it has a quite different doctype.

I once wrote Pod::HTML2POD, which is much much crazier inside than this module is. After a while, I figured that if I could (effectively!) convert HTML into POD, why not XML? And seeing Matt Sergeant's Pod::XML module got me going.


Copyright (c) 2001 Sean M. Burke. All rights reserved.

This library is free software; you can redistribute it and/or modify it under the same terms as Perl itself.

This program is distributed in the hope that it will be useful, but without any warranty; without even the implied warranty of merchantability or fitness for a particular purpose.


Sean M. Burke,
