Regexp::Parser::Objects - objects for Perl 5 regexes
This module contains the object definitions for Regexp::Parser.
All Regexp::Parser::* objects inherit from Regexp::Parser::__object__, the global object base class. All user-defined MyRx::* objects inherit from MyRx::__object__ first, then from the Regexp::Parser::* object of the same name, and finally from Regexp::Parser::__object__. Don't worry -- if you don't define a base class for your module's objects, or the object you create isn't a modification of a standard object, no warnings will be issued.
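The lookup order described above can be sketched with plain Perl packages. The Demo::Std:: and Demo::My:: prefixes are hypothetical stand-ins for Regexp::Parser:: and MyRx::; this shows ordinary Perl method resolution under that assumption, not the module's own setup code.

```perl
use strict;
use warnings;
use mro;    # core since Perl 5.10; provides mro::get_linear_isa

# Stand-ins for the standard classes (hypothetical Demo::Std:: prefix).
package Demo::Std::__object__;
sub new { my ($class, %args) = @_; return bless {%args}, $class }

package Demo::Std::exact;
our @ISA = ('Demo::Std::__object__');

# Stand-ins for user-defined classes (hypothetical Demo::My:: prefix).
# The user base class has no parents of its own.
package Demo::My::__object__;
sub origin { return 'user base class' }

package Demo::My::exact;
# User base first, then the standard object of the same name.
our @ISA = ('Demo::My::__object__', 'Demo::Std::exact');

package main;

# Linearised search order Perl uses when resolving a method call.
my @order = @{ mro::get_linear_isa('Demo::My::exact') };
print join(' -> ', @order), "\n";

# The constructor is found at the far end of the chain.
my $node = Demo::My::exact->new;
print $node->origin, "\n";
```

With Perl's default depth-first resolution, the user-defined base class is searched before the standard object of the same name, and the global base class comes last.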
All nodes inherit from Regexp::Parser::__object__ the following methods:
The object's data. This might be an array reference (for a 'branch' node), another object (for a 'quant' node), or it might not exist at all (for an 'anchor' node).
The arguments to object() to create the ending node for this object. This is used by the walk() method. Typically, a capturing group's ender is a close node, any other assertion's ender is a tail node, and a character class's ender is an anyof_close node.
The general family of this object. These are any of: alnum, anchor, anyof, anyof_char, anyof_class, anyof_range, assertion, branch, close, clump, digit, exact, flags, group, groupp, minmod, prop, open, quant, ref, reg_any, space.
The flag value for this object. This value is a number created by OR'ing together the flags that are enabled at the time.
Inserts this object into the tree. It returns a value that says whether or not it ended up being merged with the previous object in the tree.
Merges this node with the previous one, if they are of the same type. If it is called after $obj has been added to the tree, $obj will be removed from the tree. Most node types don't merge. Returns true if the node was merged with the previous one.
Whether this node is omitted from the parse tree. Certain objects do not need to appear in the tree, but are needed when inspecting the parsing, or walking the tree.
You can also set this attribute by passing a value.
The regex representation of this object. It includes the regex representation of any children of the object.
The raw representation of this object. It does not look at the children of the object, just itself. This is used primarily when inspecting the parsing of the regex.
The specific type of this object. See the object's documentation for possible values for its type.
The visual representation of this object. It includes the visual representation of any children of the object.
"Walks" the object. This is used to dive into the node's children when using a walker (see "Walking the Tree" in Regexp::Parser).
Objects may override these methods (as objects often do).
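The insert-and-merge behaviour described above can be sketched as follows. This is a simplified illustration, not the module's code: nodes are plain hashes, and make_node and insert_node are hypothetical helpers. It assumes that only adjacent 'exact' nodes merge, and it folds the merge into the insertion step rather than removing an already-inserted node.

```perl
use strict;
use warnings;

# Hypothetical node constructor: a plain hash with a type and character data.
sub make_node {
    my ($type, @chars) = @_;
    return { type => $type, data => [@chars] };
}

# Append $node to @$tree, or fold it into the previous node when both are
# 'exact' nodes. Returns 1 if the node was merged, 0 if it was appended.
sub insert_node {
    my ($tree, $node) = @_;
    my $prev = $tree->[-1];
    if ($prev && $prev->{type} eq 'exact' && $node->{type} eq 'exact') {
        push @{ $prev->{data} }, @{ $node->{data} };
        return 1;
    }
    push @$tree, $node;
    return 0;
}

my @tree;
my @merged = map { insert_node(\@tree, $_) } (
    make_node('exact', 'a'),
    make_node('exact', 'b'),    # merges into the previous exact node
    make_node('bol'),           # anchors do not merge
    make_node('exact', 'c'),    # previous node is not exact, so no merge
);

print scalar(@tree), " nodes; merge results: @merged\n";
```

Returning the merge status from the insertion step mirrors the description above, where the caller learns whether the new node survived as its own entry in the tree.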
You can't use $obj->SUPER::method() inside the __object__ class, because __object__ has no parent class. To continue along the object's own inheritance tree, use Damian Conway's NEXT module instead. This module is standard with Perl 5.8.
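A minimal sketch of that redispatch, using hypothetical Demo:: packages in place of the real classes and a made-up describe method. The user-defined base class has no parents, so SUPER:: has nowhere to go from there, whereas NEXT:: carries on through the rest of the object's hierarchy.

```perl
use strict;
use warnings;
use NEXT;    # core module; redispatches along the object's own hierarchy

# Stand-in for the global base class (hypothetical name).
package Demo::Std::__object__;
sub new      { my ($class, %args) = @_; return bless {%args}, $class }
sub describe { return 'std-base' }

# Stand-in for a standard object class.
package Demo::Std::exact;
our @ISA = ('Demo::Std::__object__');
sub describe { my $self = shift; return 'std-exact > ' . $self->NEXT::describe() }

# User-defined base class: it has no @ISA, so SUPER::describe would fail here.
package Demo::My::__object__;
sub describe { my $self = shift; return 'my-base > ' . $self->NEXT::describe() }

# User-defined object: user base first, then the standard class.
package Demo::My::exact;
our @ISA = ('Demo::My::__object__', 'Demo::Std::exact');

package main;

my $node  = Demo::My::exact->new;
my $chain = $node->describe;
print "$chain\n";
```

Each describe method hands control to the next class in the object's linearised hierarchy, so the call passes from the user base class to the standard object and then to the global base class.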
All objects share the following attributes (accessible via $obj->{...}):
The parser object with which it was created.
The flags for the object.
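As an illustration of a flag value built by OR'ing, here is a small sketch. The DEMO_FLAG_* constants and their bit values are assumptions made for this example; they are not the constants the module itself uses.

```perl
use strict;
use warnings;

# Hypothetical bit values, one bit per regex modifier.
use constant {
    DEMO_FLAG_i => 0x01,
    DEMO_FLAG_m => 0x02,
    DEMO_FLAG_s => 0x04,
    DEMO_FLAG_x => 0x08,
};

# Combine the flags that are enabled with bitwise OR.
my $flags = DEMO_FLAG_i | DEMO_FLAG_s;

# Test individual flags with bitwise AND.
my $has_i = ($flags & DEMO_FLAG_i) ? 1 : 0;
my $has_m = ($flags & DEMO_FLAG_m) ? 1 : 0;

printf "flags=%d i=%d m=%d\n", $flags, $has_i, $has_m;
```

Storing the flags as a single number keeps each node cheap to copy, and a node can record exactly which modifiers were active when it was created.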
The following attributes may also be set:
Whether this object has branches (like |).
The general family of this object.
The data or children of this object.
The direction of this object (for look-ahead/behind assertions). If less than 0, it is behind; otherwise, it is ahead.
Whether this object creates a deeper scope (like an OPEN).
Whether this object has a true/false branch (like the (?(...)T|F) assertion).
The maximum repetition count of this object (for quantifiers).
The minimum repetition count of this object (for quantifiers).
Whether this object is negated (like a look-ahead or a character class).
The capture group related to this object (like for OPEN and back references).
The flags specifically turned off for this object (for flag assertions and (?:...)).
Whether this object is omitted from the actual tree (like a CLOSE).
The flags specifically turned on for this object (for flag assertions and (?:...)).
The raw representation of this object.
The specific type of this object.
Whether this object goes into a shallower scope (like a CLOSE).
The visual representation of this object.
Whether this object is zero-width (like an anchor).
If there is a method with the name of one of these attributes, it is imperative you use the method to access the attribute when outside the class, and it's a good idea to do so inside the class as well.
All objects are prefixed with Regexp::Parser::, but that is omitted here for brevity. The headings are object classes. The field "family" represents the general category into which that object falls.
This is very sparse. Future versions will have more complete documentation. For now, read the source (!).
Family: anchor
Types: bol (^), sbol (^ with /s on, \A), mbol (^ with /m on)
Types: bound (\b), nbound (\B)
Neg: 1 if negated
Types: gpos (\G)
Types: eol ($), seol ($ with /s on, \Z), meol ($ with /m on), eos (\z)
Family: reg_any
Types: reg_any (.), sany (. with /s on), cany (\C)
Family: alnum
Types: alnum (\w), nalnum (\W)
Family: space
Types: space (\s), nspace (\S)
Family: digit
Types: digit (\d), ndigit (\D)
Family: anyof
Types: anyof ([)
Data: array reference of anyof_char, anyof_range, anyof_class
Ender: anyof_close
Family: anyof_char
Types: anyof_char (X)
Data: actual character
Family: anyof_range
Types: anyof_range (X-Y)
Data: array reference of lower and upper bounds, both anyof_char
Family: anyof_class
Types: via [:NAME:], [:^NAME:], \p{NAME}, \P{NAME}: alnum (\w, \W), alpha, ascii, cntrl, digit (\d, \D), graph, lower, print, punct, space (\s, \S), upper, word, xdigit; others are possible (Unicode properties and user-defined POSIX classes)
Data: 'POSIX' if [:NAME:], [:^NAME:] (or other POSIX notations, like [=NAME=] and [.NAME.]); otherwise, reference to alnum, digit, space, or prop object
Family: close
Types: anyof_close (] when in [...)
Omitted
Family: prop
Types: name of property (\p{NAME}, \P{NAME}); any Unicode property defined by Perl or elsewhere
Family: clump
Types: clump (\X)
Family: branch
Types: branch (|)
Data: array reference of array references, each representing one alternation, holding any number of objects
Branched
Family: exact
Types: exact (abc), exactf (abc with /i on)
Data: array reference of actual characters
Family: quant
Types: star (*), plus (+), curly (?, {n}, {n,}, {n,m})
Data: one object
Family: group
Types: group ((?:, (?i-s:)
Data: array reference of any number of objects
Ender: tail
Family: open
Types: open1, open2 ... openN (()
Ender: close
Types: close1, close2 ... closeN () when in (...)
Types: tail () when not in (...)
Family: ref
Types: ref1, ref2 .. refN (\1, \2, etc.); reff1, reff2 .. reffN (\1, \2, etc. with /i on)
Family: assertion
Types: ifmatch ((?=, (?<=)
Dir: -1 if look-behind, 1 if look-ahead
Types: unlessm ((?!, (?<!)
Types: suspend ((?>)
Types: ifthen ((?()
Data: array reference of two objects; first: ifmatch, unlessm, eval, groupp; second: branch
Family: groupp
Types: groupp1, groupp2 .. grouppN (1, 2, etc. when in (?()
Types: eval ((?{)
Data: string with contents of assertion
Types: logical ((??{)
Family: flags
Types: flags ((?i-s))
Family: minmod
Types: minmod (? after quant)
Data: an object in the quant family
Regexp::Parser, Regexp::Parser::Handlers.
Jeff "japhy" Pinyan, japhy@perlmonk.org
Copyright (c) 2004 Jeff Pinyan japhy@perlmonk.org. All rights reserved. This program is free software; you can redistribute it and/or modify it under the same terms as Perl itself.