Data::Edit::Xml - Edit data held in xml format
Transform some DocBook xml into Dita:
use Data::Edit::Xml;
use feature qw(say);

# Docbook

say STDERR Data::Edit::Xml::new(<<END)
<sli>
  <li>
    <p>Diagnose the problem</p>
    <p>This can be quite difficult</p>
    <p>Sometimes impossible</p>
  </li>
  <li>
    <p><pre>ls -la</pre></p>
    <p><pre>
drwxr-xr-x 2 phil phil 4096 Jun 15 2016 Desktop
drwxr-xr-x 2 phil phil 4096 Nov 9 20:26 Downloads
</pre></p>
  </li>
</sli>
END

# Transform to Dita

->by(sub
 {my ($o, $p) = @_;
  if ($o->at(qw(pre p li sli)) and $o->isOnlyChild)
   {$o->change($p->isFirst ? qw(cmd) : qw(stepresult));
    $p->unwrap;
   }
  elsif ($o->at(qw(li sli)) and $o->over(qr(\Ap( p)+\Z)))
   {$_->change($_->isFirst ? qw(cmd) : qw(info)) for $o->contents;
   }
 })

->by(sub
 {my ($o) = @_;
  $o->change(qw(step))          if $o->at(qw(li sli));
  $o->change(qw(steps))         if $o->at(qw(sli));
  $o->id = 's'.($o->position+1) if $o->at(qw(step));
  $o->id = 'i'.($o->index+1)    if $o->at(qw(info));
  $o->wrapWith(qw(screen))      if $o->at(qw(CDATA stepresult));
 })

# Print

->prettyString;
Produces:
<steps>
  <step id="s1">
    <cmd>Diagnose the problem</cmd>
    <info id="i1">This can be quite difficult</info>
    <info id="i2">Sometimes impossible</info>
  </step>
  <step id="s2">
    <cmd>ls -la</cmd>
    <stepresult>
      <screen>
drwxr-xr-x 2 phil phil 4096 Jun 15 2016 Desktop
drwxr-xr-x 2 phil phil 4096 Nov 9 20:26 Downloads
</screen>
    </stepresult>
  </step>
</steps>
Create a parse tree
Construct a parse tree from a file or a string
New parse - call this method statically as in Data::Edit::Xml::new(file or string) or with no parameters and then use "input", "inputFile", "inputString", "errorFile" to provide specific parameters for the parse, then call "parse" to perform the parse and return the parse tree
1 $fileNameOrString File name or string
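The static form of "new" described above can be sketched as follows. This is a minimal example that assumes the version of Data::Edit::Xml documented here is installed; the sample xml and the variable name $x are illustrative only.

```perl
use strict;
use warnings;
use Data::Edit::Xml;

# Static call with a string; a file name could be passed instead.
my $x = Data::Edit::Xml::new("<a><b><c/></b></a>");

print $x->tag,    "\n";   # a
print $x->string, "\n";   # <a><b><c/></b></a>
```

The returned value is the root node of the parse tree, so the other methods in this document can be called on it directly.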
The attributes of this node, see also: "Attributes". The frequently used attributes: class, id, href, outputclass can be accessed by an lvalue method as in: $node->id = 'c1'
Conditional strings attached to a node, see "Conditions"
Content of command: the nodes immediately below this node in the order in which they appeared in the source text, see also "Contents"
Indexes to sub commands by tag in the order in which they appeared in the source text
The labels attached to a node to provide addressability from other nodes, see: "Labels".
Error listing file. Use this parameter to explicitly set the name of the file to which any parse errors will be written. By default this file is named: zzzParseErrors/out.data
Source file of the parse if this is the parser node. Use this parameter to explicitly set the file to be parsed.
Source of the parse if this is the parser node. Use this parameter to specify some input either as a string or as a file name for the parser to convert into a parse tree
Source string of the parse if this is the parser node. Use this parameter to explicitly set the string to be parsed.
Parent node of this node or undef if the root node. See also "Traversal" and "Navigation". Consider as read only.
Parser details: the root node of a tree is the parse node for that tree. Consider as read only.
Tag name for this node, see also "Traversal" and "Navigation". Consider as read only.
Text of this node but only if it is a text node, i.e. the tag is cdata() <=> "isText" is true
The name of the tag to be used to represent text - this tag must not also be used as a command tag otherwise chaos will occur
Parse input xml
1 $p Parser created by "new"
Build a tree representation of the parsed xml which can be easily traversed to look for things
1 $parent The parent node 2 $parse The remaining parse
This is a private method.
Construct a parse tree node by node
Create a new text node
1 undef Any reference to this package 2 $text Content of new text node
Create a new non text node
1 undef Any reference to this package 2 $command The tag for the node 3 %attributes Attributes as a hash
Create a new tree - this is a static method
1 $command The name of the root node in the tree 2 %attributes Attributes of the root node in the tree as a hash
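Building a tree node by node might look like the sketch below. It assumes that "newTag" and "newText" ignore their first parameter as stated above, so they can be called through any existing node, and that "putLast" (described later in this document) attaches the new nodes. The tag names and text are illustrative only.

```perl
use strict;
use warnings;
use Data::Edit::Xml;

# Root of a new tree, created statically with one attribute.
my $root = Data::Edit::Xml::newTree('a', id => 1);

# New nodes start out detached; putLast places them in the tree.
my $child = $root->newTag('b');
$root->putLast($child);
$child->putLast($root->newText('hello'));

print $root->string, "\n";
```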
Remove a leaf node from the parse tree and make it into its own parse tree
1 $node Leaf node to disconnect
Index the children of a node so that we can access them by tag and number
1 $node Node to index
Replace < > " with &lt; &gt; &quot;. Larry Wall's excellent Xml parser unfortunately replaces &lt; &gt; &quot; &amp; etc. with their expansions in text by default and does not seem to provide an obvious way to stop this behavior, so we have to put them back again using this method. Worse, we cannot decide whether to replace & with &amp; or leave it as is: consequently you might have to examine the instances of & in your output text and guess based on the context.
1 $string String to be edited
Count the number of tags in a parse tree
1 $node Parse tree
Returns a renewed copy of the parse tree: use this method if you have added nodes via the "Put as text" methods and wish to reprocess them
Return a clone of the parse tree: use this method if you want to make temporary changes to a parse tree
Decide whether two parse trees are equal or not
1 $node1 Parse tree 1 2 $node2 Parse tree 2
Save a copy of the parse tree to a file which can be restored and return the saved node
1 $node Parse tree 2 $file File
Return a parse tree from a copy saved in a file by "save" - this is a static method so call it as Data::Edit::Xml::restore(file name)
1 $file File
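A hedged sketch of "clone", "equals", "save" and "restore" working together, assuming they behave as described above. The temporary file location comes from the core File::Temp module and is an illustrative choice, not something the module requires.

```perl
use strict;
use warnings;
use File::Temp qw(tempdir);
use Data::Edit::Xml;

my $x = Data::Edit::Xml::new("<a><b/></a>");

# A clone can be changed without touching the original tree.
my $y = $x->clone;
print $x->equals($y) ? "equal\n" : "different\n";
$y->get('b')->change('c');
print $x->equals($y) ? "equal\n" : "different\n";

# Save the original to a file and read it back with the static restore.
my $file = tempdir(CLEANUP => 1).'/tree.data';
$x->save($file);
my $z = Data::Edit::Xml::restore($file);
print $x->equals($z) ? "restored copy matches\n" : "restored copy differs\n";
```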
Create a string representation of the parse tree with optional selection of nodes via conditions
Print the parse tree
Return a string representing a node of a parse tree and all the nodes below it
1 $node Start node
Return a quoted string representing a node of a parse tree and all the nodes below it
Return a string representing a node of a parse tree with all the id attributes replaced with the labels attached to each node
Return a quoted string representing a node of a parse tree with all the id attributes replaced with the labels attached to each node
Return a string representing all the nodes below a node of a parse tree
Return a readable string representing a node of a parse tree and all the nodes below it
1 $node Start node 2 $depth Depth
Return a readable string representing a node of a parse tree and all the nodes below it with the text fields wrapped with <CDATA>...</CDATA>
Return a readable string representing all the nodes below a node of a parse tree - infrequent use and so capitalized to avoid being presented as an option by Geany
Print a subset of the parse tree determined by the conditions attached to it
Return a string representing a node of a parse tree and all the nodes below it subject to conditions to select or reject some nodes
1 $node Start node 2 @conditions Conditions to be regarded as in effect
Add conditions to a node and return the node
1 $node Node 2 @conditions Conditions to add
Delete conditions applied to a node and return the node
Return a list of conditions applied to a node
1 $node Node
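The condition methods can be combined as in the sketch below. It assumes, following the descriptions above, that a node carrying conditions is printed by "stringWithConditions" only when one of its conditions is among those passed as being in effect. The condition names web and print are made up for the example.

```perl
use strict;
use warnings;
use Data::Edit::Xml;

my $x = Data::Edit::Xml::new("<a><b/><c/></a>");

# Attach a different condition to each child.
$x->get('b')->addConditions(qw(web));
$x->get('c')->addConditions(qw(print));

print join(' ', $x->get('b')->listConditions), "\n";

# Print the tree with only the 'web' condition in effect.
print $x->stringWithConditions(qw(web)), "\n";
```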
Get or set attributes
Return the value of an attribute of the current node as an assignable value
1 $node Node in parse tree 2 $attribute Attribute name
Return the values of the specified attributes of the current node
1 $node Node in parse tree 2 @attributes Attribute names
Return the number of attributes in the specified node
1 $node Node in parse tree
Set the value of an attribute in a node and return the node
1 $node Node in parse tree 2 %values (attribute name=>new value)*
Delete the attribute, optionally checking its value first and return the node
1 $node Node 2 $attr Attribute name 3 $value Optional attribute value to check first
Delete any attributes mentioned in a list without checking their values and return the node
1 $node Node 2 @attrs Attribute name
Change the name of an attribute regardless of whether the new attribute already exists and return the node
1 $node Node 2 $old Existing attribute name 3 $new New attribute name
Change the name of an attribute unless it has already been set and return the node
Change the name and value of an attribute regardless of whether the new attribute already exists and return the node
1 $node Node 2 $old Existing attribute name 3 $oldValue Existing attribute value 4 $new New attribute name 5 $newValue New attribute value
Change the name and value of an attribute unless it has already been set and return the node
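The attribute methods can be exercised as in the following sketch. It assumes that "attr" is the lvalue accessor, "setAttr" takes name/value pairs, "renameAttr" is the variant that renames regardless of an existing target, and "deleteAttr" removes a single attribute, matching the order of the descriptions above. The attribute names and values are illustrative.

```perl
use strict;
use warnings;
use Data::Edit::Xml;

my $x = Data::Edit::Xml::new('<a id="a1" class="c"/>');

$x->attr('href') = 'b.html';            # assign through the lvalue accessor
$x->setAttr(outputclass => 'big');      # set via name => value pairs
$x->renameAttr(qw(class otherprops));   # class becomes otherprops
$x->deleteAttr('id');                   # remove id without checking its value

print $x->attrCount, "\n";              # 3
print $x->string,    "\n";
```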
Traverse the parse tree
Post-order traversal of a parse tree or sub tree and return the specified starting node
1 $node Starting node 2 $sub Sub to call for each sub node 3 @context Accumulated context
Reverse post-order traversal of a parse tree or sub tree and return the specified starting node
Pre-order traversal down through a parse tree or sub tree and return the specified starting node
Reverse pre-order traversal down through a parse tree or sub tree and return the specified starting node
Traverse parse tree visiting each node twice and return the specified starting node
1 $node Starting node 2 $before Sub to call when we meet a node 3 $after Sub to call when we leave a node 4 @context Accumulated context
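The difference between the post-order and pre-order traversals can be seen by recording the tags visited. This sketch assumes "by" is the post-order method (as used in the synopsis) and "down" is the pre-order one, and that each sub receives the current node as its first parameter.

```perl
use strict;
use warnings;
use Data::Edit::Xml;

my $x = Data::Edit::Xml::new("<a><b><c/></b><d/></a>");

my (@post, @pre);
$x->by  (sub {push @post, $_[0]->tag});   # children before their parent
$x->down(sub {push @pre,  $_[0]->tag});   # parent before its children

print "@post\n";   # c b d a
print "@pre\n";    # a b c d
```

Post-order is usually the safer choice when the sub edits the tree, because a node's children have already been processed by the time the node itself is visited.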
Contents of the specified node
Return all the nodes contained by this node either as an array or as a reference to such an array
Return all the nodes following this node at the level of this node
Return all the nodes preceding this node at the level of this node
Return a string containing the tags of all the nodes contained by this node separated by single spaces
Return a string containing the tags of all the nodes following this node separated by single spaces
Return a string containing the tags of all the nodes preceding this node separated by single spaces
Return the index of a node in its parent's content
Return the index of a node in its parent index
Return the count of the number of the specified tag types present immediately under a node
1 $node Node 2 @names Possible tags immediately under the node
Return the count of the number of instances of the specified tags under the specified node, either by tag in array context or in total in scalar context
Confirm that this is a text node
1 $node Node to test
Confirm that this is a text node and that it is blank
Move around in the parse tree
Return a sub node under the specified node as directed by the search specification: (index position?)* where index is the kind of tag to be chosen and position is the optional position within the index. Position defaults to zero if not specified. Position can also be negative to index back from the top of the index array.
1 $node Node 2 @position Search specification
Example:
$a->get(qw(a b -1))
would get the last b node under the first a node if such a node exists.
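A runnable version of this kind of search, as a sketch with made-up xml and ids:

```perl
use strict;
use warnings;
use Data::Edit::Xml;

my $x = Data::Edit::Xml::new('<r><a><b id="b1"/><b id="b2"/></a><a id="a2"/></r>');

# Position defaults to zero for 'a', so this is the last b under the first a.
my $last = $x->get(qw(a b -1));
print $last->id, "\n";                # b2

# An explicit position selects the second a.
print $x->get(qw(a 1))->id, "\n";     # a2
```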
Use getX to execute get but die 'get' instead of returning undef
Return an array of all the nodes with the specified tag below the specified node
1 $node Node 2 $tag Tag
Return the first node below this node
Use firstNonBlank to skip a (rare) initial blank text CDATA. Use firstNonBlankX to die rather than receive a returned undef result.
Return the first node matching one of the named tags under the specified node
1 $node Node 2 @tags Tags to search for
Use firstInX to execute firstIn but die 'firstIn' instead of returning undef
Return the first node encountered in the specified context in a depth first post-order traversal of the parse tree
1 $node Node 2 @context Array of tags specifying context
Use firstContextOfX to execute firstContextOf but die 'firstContextOf' instead of returning undef
Return the last node below this node
Use lastNonBlank to skip a (rare) initial blank text CDATA. Use lastNonBlankX to die rather than receive a returned undef result.
Use lastInX to execute lastIn but die 'lastIn' instead of returning undef
Return the last node encountered in the specified context in a depth first reverse pre-order traversal of the parse tree
Use lastContextOfX to execute lastContextOf but die 'lastContextOf' instead of returning undef
Return the node next to the specified node
Use nextNonBlank to skip a (rare) initial blank text CDATA. Use nextNonBlankX to die rather than receive a returned undef result.
Return the next node matching one of the named tags
Use nextInX to execute nextIn but die 'nextIn' instead of returning undef
Return the node previous to the specified node
Use prevNonBlank to skip a (rare) initial blank text CDATA. Use prevNonBlankX to die rather than receive a returned undef result.
Return the previous node matching one of the named tags
Use prevInX to execute prevIn but die 'prevIn' instead of returning undef
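Stepping between siblings with "first", "last", "next" and "prev" might look like this sketch; the xml is illustrative.

```perl
use strict;
use warnings;
use Data::Edit::Xml;

my $x = Data::Edit::Xml::new("<a><b/><c/><d/></a>");

print $x->first->tag,       "\n";   # b
print $x->last->tag,        "\n";   # d
print $x->first->next->tag, "\n";   # c
print $x->last->prev->tag,  "\n";   # c
```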
Return the first ancestral node that matches the specified context
1 $node Start node 2 @tags Tags identifying context
Use uptoX to execute upto but die 'upto' instead of returning undef
Confirm that the node has the specified ancestry
1 $node Starting node 2 @context Ancestry
Return a string containing the tag of this node and its ancestors separated by single spaces
Confirm that this node is the first node under its parent
Confirm that this node is the last node under its parent
Confirm that this node is the only node under its parent
Confirm that this node is empty, that is: this node has no content, not even a blank string of text
Confirm that the string representing the tags at the level below this node match a regular expression
1 $node Node 2 $re Regular expression
Confirm that the string representing the tags following this node match a regular expression
Confirm that the string representing the tags preceding this node match a regular expression
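The confirmation methods are typically used as guards before an edit. The sketch below assumes "context", "at", "isFirst" and "over" behave as described above, with "over" matching against the space separated tags of the node's content.

```perl
use strict;
use warnings;
use Data::Edit::Xml;

my $x = Data::Edit::Xml::new("<a><b><c/><d/></b></a>");
my $c = $x->get(qw(b c));

print $c->context, "\n";                                      # c b a
print "c is under b under a\n" if $c->at(qw(c b a));
print "c is the first child\n" if $c->isFirst;
print "b contains c then d\n"  if $c->parent->over(qr(\Ac d\z));
```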
Edit the data in the parse tree
Change the structure of the parse tree
Change the name of a node in an optional tag context and return the node
1 $node Node 2 $name New name 3 @tags Tags defining the context
Use changeX to execute change but die 'change' instead of returning undef
Wrap the original node in a new node forcing the original node down deepening the parse tree; return the new wrapping node
1 $old Node 2 $tag Tag for new node 3 %attributes Attributes for new node
Wrap the original node in a sequence of new nodes forcing the original node down deepening the parse tree; return the array of wrapping nodes
1 $node Node to wrap 2 @tags Tags to wrap the node with - with the uppermost tag rightmost
Wrap the content of the original node in a sequence of new nodes forcing the original node up deepening the parse tree; return the array of wrapping nodes
Wrap the content of a node in a new node, the original content then contains the new node which contains the original node's content; returns the new wrapped node
Unwrap a node by inserting its content into its parent at the point containing the node; returns the parent node
1 $node Node to unwrap
Replace a node (and all its content) with a new node (and all its content) and return the new node
1 $old Old node 2 $new New node
Replace a node (and all its content) with a new text node and return the new node
1 $old Old node 2 $text Text of new node
Replace a node (and all its content) with a new blank text node and return the new node
1 $old Old node
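A short sketch of "change", "wrapWith" and "unwrap" applied in sequence, with the string printed after each structural step. The tags are illustrative.

```perl
use strict;
use warnings;
use Data::Edit::Xml;

my $x     = Data::Edit::Xml::new("<a><b><c/></b></a>");
my $outer = $x->get('b');
my $inner = $x->get(qw(b c));

$inner->change('d');         # rename c to d
$inner->wrapWith('e');       # push d down under a new e
print $x->string, "\n";      # <a><b><e><d/></e></b></a>

$outer->unwrap;              # remove b, keeping its content in place
print $x->string, "\n";      # <a><e><d/></e></a>
```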
Move nodes around in the parse tree
Cut out a node - remove the node from the parse tree and return the node so that it can be put elsewhere
1 $node Node to cut out
Place the new node at the front of the content of the original node and return the new node
1 $old Original node 2 $new New node
Place the new node at the end of the content of the original node and return the new node
Place the new node just after the original node in the content of the parent and return the new node
Place the new node just before the original node in the content of the parent and return the new node
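Moving a node is a "cut" followed by one of the put methods. This sketch assumes "putFirst" and "putNext" are the methods described above for placing a node first under, or just after, the original node; "newTag" is used only to create an extra illustrative node.

```perl
use strict;
use warnings;
use Data::Edit::Xml;

my $x = Data::Edit::Xml::new("<a><b/><c/></a>");
my $c = $x->get('c');

$x->putFirst($c->cut);            # move c to the front of a
print $x->string, "\n";           # <a><c/><b/></a>

$c->putNext($x->newTag('d'));     # place a new d just after c
print $x->string, "\n";           # <a><c/><d/><b/></a>
```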
Split the content of a node by moving preceding or following nodes to a preceding or following node
Concatenate two successive nodes and return the target node
1 $target Target node to replace 2 $source Node to concatenate
Concatenate preceding and following nodes that have the same tag as the specified node and return the specified node
1 $node Concatenate around this node
Move the specified node and all its preceding nodes to a newly created node preceding this node's parent and return the new node (mm July 31, 2017)
1 $old Move this node and its preceding nodes 2 $new The name of the new node
Move all the nodes preceding a specified node to a newly created node preceding this node's parent and return the new node
1 $old Move all the nodes preceding this node 2 $new The name of the new node
Move the specified node and all its following nodes to a newly created node following this node's parent and return the new node
1 $old Move this node and its following nodes 2 $new The name of the new node
Move all the nodes following a node to a newly created node following this node's parent and return the new node
1 $old Move the nodes following this node 2 $new The name of the new node
Add text to the parse tree
Add a new text node first under a parent and return the new text node
1 $node The parent node 2 $text The string to be added which might contain unparsed Xml as well as text
Add a new text node last under a parent and return the new text node
Add a new text node following this node and return the new text node
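The "Put as text" methods add a single text node even when the string looks like xml. The sketch below assumes "putFirstAsText" is the first of these methods and uses "renew", described earlier, to reparse the tree so the added markup becomes real nodes.

```perl
use strict;
use warnings;
use Data::Edit::Xml;

my $x = Data::Edit::Xml::new("<a><b/></a>");

# The string is held as one text node; it is not parsed at this point.
my $t = $x->putFirstAsText('<p>hello</p>');
print $t->isText ? "text node\n" : "element\n";

# Reparse the tree so that the markup in the text becomes nodes.
my $y = $x->renew;
print $y->first->tag, "\n";
```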
Additional labels for a node which will be recognized by Data::Edit::Xml::Lint
Add the named labels to the specified node and return that node
1 $node Node in parse tree 2 @labels Names of labels to add
Return the count of the number of labels at a node
Return the names of all the labels set on a node
Delete the specified labels in the specified node and return that node
1 $node Node in parse tree 2 @labels Names of the labels to be deleted
Delete all the labels in the specified node and return that node
Copy all the labels from the source node to the target node and return the source node
1 $source Source node 2 $target Target node
Move all the labels from the source node to the target node and return the source node
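Labels can be added, counted and copied as in this sketch; the label names are arbitrary. The list from "getLabels" is sorted explicitly here so the example does not rely on the order in which the module returns it.

```perl
use strict;
use warnings;
use Data::Edit::Xml;

my $x     = Data::Edit::Xml::new("<a><b/><c/></a>");
my $first = $x->get('b');
my $other = $x->get('c');

$first->addLabels(qw(l1 l2));
print $first->countLabels, "\n";                  # 2
print join(' ', sort $first->getLabels), "\n";    # l1 l2

$first->copyLabels($other);                       # source keeps its labels
print $other->countLabels, "\n";                  # 2
```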
Operator access to methods. Use the assign versions to avoid error messages about pointless expressions in a void context; use the non-assign versions to return the results of the underlying method call. Thus '/' returns the wrapping node, whilst '/=' does not.
-c : clone, -p : pretty string, -r : renew, -s : string, -t : tag.
1 $node Node 2 $op Monadic operator
-p $x
to print node $x as a pretty string
@{} : content of a node.
grep {...} @$x
to search the contents of node $x
>>= : Write a parse tree out on a file.
1 $node Node 2 $file File
$x >>= *STDERR
<= : Check that a node is in the context specified by the referenced array of words.
1 $node Node 2 $context Reference to array of words specifying the parents of the desired node
$c <= [qw(c b a)]
to confirm that node $c has tag 'c', parent 'b' and grand parent 'a'
+ : put a node or string first under a node.
1 $node Node 2 $text Node or text to place first under the node
my $f = $a + '<p>first</p>'
- : put a node or string last under a node.
1 $node Node 2 $text Node or text to place last under the node
my $l = $a - '<p>last</p>'
> : put a node or string after the current node.
1 $node Node 2 $text Node or text to place after the first node
my $n = $a > '<p>next</p>'
< : put a node or string before the current node.
1 $node Node 2 $text Node or text to place before the first node
my $p = $a < '<p>prev</p>'
x= : Traverse a parse tree in post-order.
1 $node Parse tree 2 $code Code to execute against each node
$a x= sub {say -s $_}
to print each sub tree in a parse tree
>> : Search for a node via a specification provided as a reference to an array of words and numbers. Each word represents a tag name, each number the index of the previous tag or zero by default.
1 $node Node 2 $get Reference to an array of search parameters
my $f = $a >> [qw(aa 1 bb)]
to find the first bb under the second aa under $a
% : Get the value of an attribute of this node.
1 $node Node 2 $attr Attribute name
my $a = $x % 'href'
to get the href attribute of the node at $x
+= : Set the tag for a node.
$a += 'tag'
to change the tag to 'tag' at the node $a
-= : Set the id for a node.
1 $node Node 2 $id Id
$a -= 'id'
to change the id to 'id' at node $a
/ or /= : Wrap node with a tag, returning or not returning the wrapping node.
$x /= 'aa'
to wrap node $x with a node with a tag of 'aa'
* or *= : Wrap content with a tag, returning or not returning the wrapping node.
$x *= 'aa'
to wrap the content of node $x with a node with a tag of 'aa'
-- : Cut out a node.
--$x
to cut out the node $x
++ : Unwrap a node.
++$x
to unwrap the node $x
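Several of the read-only operators can be combined in one short script. This sketch assumes the overloads listed above (>>, %, <= and the monadic -s) are available in the installed version; the xml is illustrative.

```perl
use strict;
use warnings;
use Data::Edit::Xml;

my $x = Data::Edit::Xml::new('<a><b href="u"/></a>');

my $node = $x >> [qw(b)];           # same as $x->get('b')
my $href = $node % 'href';          # attribute value
my $ok   = $node <= [qw(b a)];      # context check, like $node->at(qw(b a))
my $xml  = -s $x;                   # string form of the tree

print "$href\n";
print "b is under a\n" if $ok;
print "$xml\n";
```

Assigning each result to a variable avoids the warnings about useless expressions in void context mentioned above.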
Debugging methods
Print the attributes of a node
1 $node Node whose attributes are to be printed
Print the attributes of a node replacing the id with the labels
Check the parent pointers are correct in a parse tree
1 $x Parse tree
Check that every node has a parser
Replace new line with N
1 $s String
addConditions
addLabels
after
at
attr :lvalue
attrCount
attributes
attrs
before
by
byReverse
c
cdata
change
changeAttr
changeAttrValue
changeX
checkParentage
checkParser
clone
concatenate
concatenateSiblings
conditions
content
contentAsTags
contentBefore
contentBeforeAsTags
contentBeyond
contentBeyondAsTags
contents
contentString
context
copyLabels
count
countLabels
cut
deleteAllLabels
deleteAttr
deleteAttrs
deleteConditions
deleteLabels
disconnectLeafNode
down
downReverse
equals
errorsFile
first
firstContextOf
firstContextOfX
firstIn
firstInX
firstNonBlank
firstNonBlankX
get
getLabels
getX
index
indexes
indexNode
input
inputFile
inputString
isBlankText
isEmpty
isFirst
isLast
isOnlyChild
isText
labels
last
lastContextOf
lastContextOfX
lastIn
lastInX
lastNonBlank
lastNonBlankX
listConditions
moveLabels
new
newTag
newText
newTree
next
nextIn
nextInX
nextNonBlank
nextNonBlankX
nn
opAttr
opBy
opContents
opContext
opCut
opGet
opOut
opPutFirst
opPutLast
opPutNext
opPutPrev
opSetId
opSetTag
opString
opUnWrap
opWrapContentWith
opWrapWith
over
parent
parse
parser
position
present
PrettyContentString
prettyString
prettyStringShowingCDATA
prev
prevIn
prevInX
prevNonBlank
prevNonBlankX
printAttributes
printAttributesReplacingIdsWithLabels
putFirst
putFirstAsText
putLast
putLastAsText
putNext
putNextAsText
putPrev
putPrevAsText
renameAttr
renameAttrValue
renew
replaceSpecialChars
replaceWith
replaceWithBlank
replaceWithText
restore
save
setAttr
splitBack
splitBackEx
splitForwards
splitForwardsEx
string
stringQuoted
stringReplacingIdWithLabels
stringReplacingIdWithLabelsQuoted
stringWithConditions
tag
tags
text
through
tree
unwrap
upto
uptoX
wrapContentWith
wrapDown
wrapUp
wrapWith
This module is written in 100% Pure Perl and, thus, it is easy to read, use, modify and install.
Standard Module::Build process for building and installing modules:
perl Build.PL
./Build
./Build test
./Build install
philiprbrenan@gmail.com
http://www.appaapps.com
Copyright (c) 2016-2017 Philip R Brenan.
This module is free software. It may be used, redistributed and/or modified under the same terms as Perl itself.