NAME

XML::XPathScript::Processor - the XML transformation engine in XML::XPathScript

SYNOPSIS

In a stylesheet ->{testcode} sub, e.g. for DocBook's <ulink> tag:

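      my $url = findvalue('@url', $self);
      if (findnodes("node()", $self)) {
          # ...
          # The element has content: wrap the rendered children in the link.
          $t->{pre}  = qq'<a href="$url">';
          $t->{post} = qq'</a>';
          return DO_SELF_AND_KIDS;
      } else {
          # The element is empty: use the URL itself as the link text.
          $t->{pre}  = qq'<a href="$url">$url</a>';
          $t->{post} = qq'';
          return DO_SELF_ONLY;
      }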

At the stylesheet's top-level one often finds:

   <%= apply_templates() %>

DESCRIPTION

The XML::XPathScript distribution offers XML parser glue, an embedded stylesheet language, and a way of processing an XML document into text output. This package implements the last of these: it takes an already filled-in $t template hash and an already parsed XML document (both of which come from XML::XPathScript behind the scenes), and provides a simple API for implementing stylesheets. In particular, the "apply_templates" function triggers the recursive expansion of the whole XML document when used as shown in "SYNOPSIS".

XPathScript Language Functions

All of these functions are intended to be called solely from within the ->{testcode} templates or <% %> or <%= %> blocks in XPathScript stylesheets. They are automatically exported to both these contexts.

    DO_SELF_AND_KIDS, DO_SELF_ONLY, DO_NOT_PROCESS, DO_TEXT_AS_CHILD

    Symbolic constants evaluating respectively to 1, -1, 0 and 2, to be used as mnemonic return values in ->{testcode} routines instead of the numeric values, which are harder to remember. Specifically:

    DO_SELF_AND_KIDS

    tells XML::XPathScript::Processor to render the current node as $t->{pre}, followed by the result of the call to "apply_templates" on the subnodes, followed by $t->{post}.

    DO_SELF_ONLY

    tells XML::XPathScript::Processor to render the current node simply as $t->{pre}, followed by $t->{post}.

    DO_NOT_PROCESS

    tells XML::XPathScript::Processor to render the current node as the empty string.

    DO_TEXT_AS_CHILD

    only meaningful for text nodes. When this value is returned, XML::XPathScript::Processor pretends that the text is a child of the node, which basically means that $t->{pre} and $t->{post} will frame the text instead of replacing it.

    E.g.

            $t->{pre} = '<text/>';
            #  will do <foo>bar</foo>  =>  <foo><text/></foo>
    
    
            $t->{pre} = '<t>';
            $t->{post} =  '</t>';
            $t->{testcode} = sub{ DO_TEXT_AS_CHILD };
            #  will do <foo>bar</foo>  =>  <foo><t>bar</t></foo>

    findnodes($path)
    findnodes($path, $context)

    Returns a list of nodes found by XPath expression $path, optionally using $context as the context node (default is the root node of the current document). In scalar context returns a NodeSet object (but you do not want to do that, see "XPath scalar return values considered harmful" in XML::XPathScript).
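    The list-context form can be sketched as follows. This is a fragment meant to sit inside a ->{testcode} routine where $self is the current node, as in "SYNOPSIS"; it is not a standalone script. The element names section and title are illustrative assumptions.

```perl
# List context yields plain node objects, which can be iterated directly.
my @sections = findnodes('section', $self);    # relative to the current node

my @titles;
for my $section (@sections) {
    # Passing each node back as $context scopes the next query to it.
    push @titles, findvalue('title', $section);
}

# Without $context, the path is evaluated from the root of the document.
my @all_titles = findnodes('//title');
```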

    findvalue($path)
    findvalue($path, $context)

    Evaluates XPath expression $path and returns the resulting value. If the path returns one of the "Literal", "Numeric" or "NodeList" XPath types, the stringification is done automatically for you using "xpath_to_string".

    xpath_to_string($blob)

    Converts any XPath data type, such as "Literal", "Numeric", "NodeList", text nodes, etc. into a pure Perl string (UTF-8 tainted too - see "is_utf8_tainted"). Scalar XPath types are interpreted in the straightforward way, DOM nodes are stringified into conformant XML, and NodeLists are stringified by concatenating the stringifications of their members (in the latter case, the result is not guaranteed to be valid XML).

    See "XPath scalar return values considered harmful" in XML::XPathScript on why this is useful.

    findvalues($path)
    findvalues($path, $context)

    Evaluates XPath expression $path as a nodeset expression, just like "findnodes" would, but returns a list of UTF8-encoded XML strings instead of node objects or node sets. See also "XPath scalar return values considered harmful" in XML::XPathScript.

    findnodes_as_string($path)
    findnodes_as_string($path, $context)

    Similar to "findvalues" but concatenates the XML snippets. The result obviously is not guaranteed to be valid XML.
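    As a rough comparison of the string-returning functions, the fragment below queries the same hypothetical <item> children in three ways. It assumes it runs inside a ->{testcode} routine with $self as the current node; the element name item is made up for illustration.

```perl
# One XML string per matching node.
my @items = findvalues('item', $self);

# The same snippets as a single string. It may hold several sibling
# elements, so it need not be a well-formed XML document by itself.
my $all_items = findnodes_as_string('item', $self);

# The value of a non-nodeset expression, already converted to a Perl string.
my $count = findvalue('count(item)', $self);
```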

    matches($node, $path)
    matches($node, $path, $context)

    Returns true if $node matches the XPath expression $path (optionally evaluated in context $context).

    apply_templates()
    apply_templates($xpath)
    apply_templates($xpath, $context)
    apply_templates(@nodes)

    This is where the magic of XPathScript resides: apply_templates recursively applies the stylesheet templates to the nodes given either literally (last invocation form) or through an XPath expression (second and third invocation forms), and returns the concatenation of all the results as a string. Called without any arguments, it renders the whole document (same as apply_templates("/")).

    Calls to apply_templates() may occur both explicitly (at the top of the document, and for rendering subnodes when the templates choose to handle that by themselves) and implicitly (when testcode routines ask XML::XPathScript::Processor to "DO_SELF_AND_KIDS").

    If appropriate care is taken in all templates (especially the testcode routines and the text() template), the string result of apply_templates need not be UTF-8 (see "binmode" in XML::XPathScript): it is thus possible to use XPathScript to produce output in any character set without an extra translation pass.
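    The XPath and node-list invocation forms might be combined as in the sketch below, where a template renders its own children. The <chapter>, <title> and <para> element names are hypothetical, and the $t->{chapter}{testcode} registration form is assumed from the XML::XPathScript stylesheet conventions rather than defined in this document.

```perl
# Hypothetical template for a <chapter> element.
$t->{chapter}{testcode} = sub {
    my ($self, $t) = @_;

    # XPath form: render only the <title> children of the current node.
    my $title = apply_templates('title', $self);

    # Node-list form. A call with no arguments renders the whole document,
    # so an empty list is guarded against rather than passed through.
    my @paras = findnodes('para', $self);
    my $body  = @paras ? apply_templates(@paras) : '';

    $t->{pre}  = qq'<div class="chapter">$title$body';
    $t->{post} = qq'</div>';
    return DO_SELF_ONLY;    # the children were already rendered above
};
```

    Returning DO_SELF_ONLY here avoids rendering the children a second time, since the template has already called apply_templates on them.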

    call_template($node, $t, $templatename)

    EXPERIMENTAL - allows testcode routines to invoke a template by name, even if the selectors do not fit (e.g. one can apply template B to an element node of type A). Returns the stylesheeted string computed out of $node just like "apply_templates" would.

    is_element_node ( $object )

    Returns true if $object is an element node, false otherwise.

    is_text_node ( $object )

    Returns true if $object is a "true" text node (not a comment node), false otherwise.

    is_comment_node ( $object )

    Returns true if $object is an XML comment node, false otherwise.

    is_pi_node ( $object )

    Returns true if $object is a processing instruction node, false otherwise.

    is_nodelist ( $object )

    Returns true if $object is a node list (as returned by "findnodes" in scalar context), false otherwise.
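    The node-type predicates above lend themselves to a simple dispatch over mixed content. The following fragment is only a sketch, assumed to run inside a ->{testcode} routine with $self as the current node; the labels it collects and the para element name are arbitrary.

```perl
# Build a short description of each child of the current node.
my @summary;
for my $child ( findnodes('node()', $self) ) {
    if    ( is_element_node($child) ) { push @summary, 'element' }
    elsif ( is_text_node($child) )    { push @summary, 'text' }
    elsif ( is_comment_node($child) ) { push @summary, 'comment' }
    elsif ( is_pi_node($child) )      { push @summary, 'processing instruction' }
}

# is_nodelist() recognizes what findnodes() hands back in scalar context.
my $result = findnodes('para', $self);
my $kind   = is_nodelist($result) ? 'node list' : 'something else';
```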

    is_utf8_tainted($string)

    Returns true if Perl thinks that $string is a string of characters (in UTF-8 internal representation), and false if Perl treats $string as a meaningless string of bytes.

    The dangerous part of the story is when concatenating a non-tainted string with a tainted one, as it causes the whole string to be re-interpreted into UTF-8, even the part that was supposedly meaningless character-wise, and that happens in a nonportable fashion (depends on locale and Perl version). So don't do that - and use this function to prevent that from happening.
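    One way to follow this advice is to decode byte strings into characters before concatenating them with character data. The standalone sketch below uses only core Perl. The helper names looks_utf8_tainted and as_characters are hypothetical, and the first is merely a stand-in that assumes is_utf8_tainted behaves like the core utf8::is_utf8 flag test.

```perl
use strict;
use warnings;
use Encode qw(decode);

# Stand-in for is_utf8_tainted(); assumed equivalent to the core flag test.
sub looks_utf8_tainted { return utf8::is_utf8( $_[0] ) ? 1 : 0 }

# Return $string as characters, decoding it first if it is still bytes.
sub as_characters {
    my ( $string, $encoding ) = @_;
    return looks_utf8_tainted($string) ? $string : decode( $encoding, $string );
}

my $bytes = "caf\xC3\xA9";            # UTF-8 encoded bytes, not tainted
my $chars = "na\x{EF}ve \x{263A}";    # character string, tainted

# Both operands are character strings by the time they are joined.
my $combined = as_characters( $bytes, 'UTF-8' ) . ' ' . $chars;
```

    Inside a stylesheet the exported is_utf8_tainted would take the place of the stand-in; the encoding passed to the helper has to be known by the caller.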

    get_xpath_of_node($node)

    Returns an XPath string that points to $node, from the root. Useful to create error messages that point at some location in the original XML document.
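    For instance, a testcode routine for the <ulink> element of "SYNOPSIS" could report a missing attribute along these lines. This is a fragment under the same assumptions as the synopsis ($self is the current node), and the wording of the message is made up.

```perl
my $url = findvalue('@url', $self);
if ( $url eq '' ) {
    # Report where the offending element sits in the source document.
    warn 'ulink without a url attribute at ' . get_xpath_of_node($self) . "\n";
    return DO_NOT_PROCESS;
}
```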

BUGS

Right now XML::XPathScript::Processor is just an auxiliary module to XML::XPathScript which should not be called directly: in other words, XPathScript's XML processing engine is not (yet) properly decoupled from the stylesheet language parser, and thus cannot stand alone.
