NAME

Decl::Node - implements a node in a declarative structure.

VERSION

Version 0.08

SYNOPSIS

Each node in a Decl structure is represented by one of these objects. Specific semantics modules subclass these nodes for each of their components.

defines(), tags_defined()

Called by Decl during import to find out which xmlapi tags this plugin claims to implement. This is a class method, and by default it claims none.

The wantsbody function governs how iterator works.

overloaded ""

The node class returns tag(class) when expressed as a string.

refaddr_or_undef

This is a cheap trick we're going to use for inserting children after other children.

new()

The constructor for a node takes either a single text or an arrayref containing two texts. A single text is the entire line-and-body of a node; with the arrayref, the line and the body are already separated. If they're delivered together, they're split before proceeding.

The line and body are retained, although they may be further parsed later. If the body is parsed, its text is discarded and is reconstructed if it's needed for self-description. (This can be suppressed if a non-standard parser is used that has no self-description facility.)

The node's tag is the first word in the line. The tag determines everything pertaining to this entire section of the application, including how its contents are parsed.

splittag - class method

This splits the flag off a tag (e.g. template. => template + .)
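
As a rough illustration of the split, here is a minimal sketch. It is not the module's implementation: it assumes a flag is a single trailing non-word character, and the helper name split_tag_flag is hypothetical.

```perl
use strict;
use warnings;

# Hypothetical helper: split a trailing one-character flag off a tag.
# Assumes the flag is a single non-word character at the end of the tag.
sub split_tag_flag {
    my ($tag) = @_;
    if ($tag =~ /^(.*\w)(\W)$/) {
        return ($1, $2);
    }
    return ($tag, '');    # no flag present
}

my ($tag, $flag) = split_tag_flag('template.');
print "$tag + $flag\n";
```

A tag without a trailing flag comes back unchanged with an empty flag.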

tag(), flag(), is($tag), name(), names(), line(), hasbody(), body(), elements(), truenodes(), payload()

Accessor functions.

nodes($flavor)

The true nodes (truenodes()) of a parent are the actual structural children that aren't comments. This function returns the functional nodes - by using a grouping structure, the results of macros, selects, and inserts can appear to be rooted in the parent at precisely the place their progenitor is located.

If $flavor is specified, nodes() returns only those children with tags equal to $flavor; otherwise, all functional children are returned.

content_nodes($flavor)

The content nodes of a parent are the functional nodes returned by nodes minus any that have the flag ':'. This permits nodes to be split into "meta" specifications and child specifications for a given parent. An example might be providing a "style:" parameter for a text structure, or a "path:" parameter for a directory.
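
The meta/content split can be pictured with a small sketch. Nodes are modeled here as plain hashrefs rather than Decl::Node objects, the function name content_only is hypothetical, and filtering by $flavor is assumed to work the same way as in nodes().

```perl
use strict;
use warnings;

# Each node is modeled as a plain hashref; the real class uses objects.
my @nodes = (
    { tag => 'style', flag => ':', label => 'bold' },    # "meta" specification
    { tag => 'line',  flag => '',  label => 'first' },
    { tag => 'line',  flag => '',  label => 'second' },
);

# Hypothetical stand-in for content_nodes($flavor): drop ':'-flagged nodes,
# then optionally keep only those whose tag equals $flavor.
sub content_only {
    my ($flavor, @candidates) = @_;
    my @content = grep { $_->{flag} ne ':' } @candidates;
    @content = grep { $_->{tag} eq $flavor } @content if defined $flavor;
    return @content;
}

my @content = content_only(undef, @nodes);
print scalar(@content), " content nodes\n";
```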

parent(), ancestry()

A list of all the tags of nodes above this one, culminating in this one's tag, returned as an arrayref.

parameter($p), option($o), parmlist(), optionlist(), parameter_n(), option_n(), label(), parser(), code(), gencode(), errors(), bracket(), comment()

More accessor functions.

plist(@parameters)

Given a list of parameters, returns a hash (not a hashref) of their values, first looking in the parameters, then looking for children of the same name and returning their labels if necessary. This allows us to specify a parameter for a given object either like this:

   object (parm1=value1, parm2 = value2)
   

or like this:

   object
      parm1 "value1"
      parm2 "value2"
      

It just depends on what you find more readable at the time. For this to work during payload build, though, the children have to be built first, which isn't the default - so you have to call $self->build_children before using this in the payload build.

This is really useful if you're wrapping a module that uses a hash to initialize its object. Like, say, LWP::UserAgent.
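
The lookup order can be sketched as follows. This is an illustration under assumptions, not the module's code: the node is a plain hashref, plist_sketch is a hypothetical name, and names that are found nowhere are assumed to be left out of the result.

```perl
use strict;
use warnings;

# A node modeled as a hashref: explicit parameters plus children with labels.
my $node = {
    parameters => { parm1 => 'value1' },
    children   => [ { tag => 'parm2', label => 'value2' } ],
};

# Hypothetical stand-in for plist(@parameters): parameters win, then the
# label of the first child whose tag matches the requested name.
sub plist_sketch {
    my ($n, @wanted) = @_;
    my %values;
    for my $name (@wanted) {
        if (exists $n->{parameters}{$name}) {
            $values{$name} = $n->{parameters}{$name};
            next;
        }
        my ($child) = grep { $_->{tag} eq $name } @{ $n->{children} };
        $values{$name} = $child->{label} if $child;
    }
    return %values;    # a hash, not a hashref
}

my %args = plist_sketch($node, qw(parm1 parm2 parm3));
print join(', ', map { "$_=$args{$_}" } sort keys %args), "\n";
```

Because the result is a flat hash, it can be passed straight to a constructor that takes key/value pairs.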

parm_css (parameter), set_css_values (hashref, parameter_string), prepare_css_value (hashref, name), get_css_value (hashref, name)

CSS is characterized by a sort of "parameter tree", where many parameters can be seen as nested in a hierarchy. Take fonts, for example. A font has a size, a name, a bolded flag, and so on. To specify a font, then, we end up with things like font-name, font-size, font-bold, etc. In CSS, we can also group those things together and get something like font="name: Times; size: 20", and that is equivalent to font-name="Times", font-size="20". See?

This function does the same thing with the parameters of a node. If you give it a name "font" it will find /font-*/ as well, and munge the values into the "font" value. It returns a hashref containing the entire hierarchy of these things, and it will also interpret any string-type parameters in the higher levels, e.g. font="size: 20; name: Times" will go into {size=>20, name=>'Times'}. Honestly, I love this way of handling parameters in CSS.

If you give a name "font-size" it will also find any font="size: 20" specification and retrieve the appropriate value.

It won't decompose multiple hierarchical levels starting from a string (e.g. something like font="size: {type: 3}" will not be parsed for font-size-type), because you'd need curly brackets or something anyway, and this ain't JSON, it's just simple CSS-like parameter addressing.
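
A minimal sketch of the grouping idea, assuming the parameters are available as a plain hash. The name css_group is hypothetical, and letting hyphenated parameters override values from the string form is an assumption, not documented behavior.

```perl
use strict;
use warnings;

# Hypothetical stand-in for parm_css: collect "font" and every "font-*"
# parameter into one hashref. One level of nesting only, as in the text.
sub css_group {
    my ($params, $name) = @_;
    my %group;

    # A string value such as "size: 20; name: Times" is split into pairs.
    if (defined $params->{$name}) {
        for my $pair (split /\s*;\s*/, $params->{$name}) {
            my ($key, $value) = $pair =~ /^\s*([\w-]+)\s*:\s*(.*?)\s*$/ or next;
            $group{$key} = $value;
        }
    }

    # Hyphenated parameters such as font-bold are folded in afterwards.
    for my $key (sort keys %$params) {
        $group{$1} = $params->{$key} if $key =~ /^\Q$name\E-(.+)$/;
    }
    return \%group;
}

my $font = css_group(
    { font => 'size: 20; name: Times', 'font-bold' => 1, color => 'red' },
    'font',
);
print join(', ', map { "$_=$font->{$_}" } sort keys %$font), "\n";
```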

flags({flag=>numeric value, ...}), oflags({flag=>numeric value, ...})

A quick utility to produce an OR'd flag set from a list of parameter words. Pass it a hashref containing numeric values for a set of words, and you'll get back the OR'd sum of the flags found in the parameters. The flags function does this for the parameters (round parens) and the oflags function does the same for the options [square brackets].
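
The OR-ing itself is simple; here is a sketch with the parameter words passed in explicitly. The function name or_flags and the flag values are made up for the example.

```perl
use strict;
use warnings;

# Hypothetical stand-in for flags(): OR together the values of every
# word in the lookup table that appears among the node's parameters.
sub or_flags {
    my ($table, @words) = @_;
    my $result = 0;
    for my $word (@words) {
        $result |= $table->{$word} if exists $table->{$word};
    }
    return $result;
}

my %style = (bold => 1, italic => 2, underline => 4);
my $bits = or_flags(\%style, qw(bold underline size));
print "$bits\n";    # bold (1) | underline (4); "size" is not a flag word
```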

list_parameter ($name)

Sometimes, instead of having e.g. position-x and position-y parameters, it's easier to have something like p=40 20 or dim=20x20. We can use the list_parameter function to obtain a list of any numbers separated by non-number characters. (Note that due to the line parser using commas to separate the parameters themselves, the separator can't be a comma. Unless you want to write a different line parser, in which case, go you!)

So the separator characters can be: !@#$%^&*|:;~x and space.
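
One way to picture the extraction is a single global match. This is a sketch only: number_list is a hypothetical name, and accepting a decimal part is an assumption.

```perl
use strict;
use warnings;

# Hypothetical stand-in for list_parameter(): pull every run of digits
# (with an optional decimal part) out of a parameter value.
sub number_list {
    my ($value) = @_;
    return () unless defined $value;
    return $value =~ /(\d+(?:\.\d+)?)/g;
}

my @position = number_list('40 20');
my @size     = number_list('20x20');
print "@position / @size\n";
```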

BUILDING STRUCTURE

load ($string, $after)

The load method loads declarative specification text into a node by calling the parser appropriate to the node. Multiple loads can be carried out, and will simply add to text already there.

The return value is the list of objects added to the target, if any.

macroinsert ($spec, $after)

This function adds structure to a given node at runtime that won't show up in the node's describe results. It is used by the macro system (hence the name) but can be used by other runtime structure modifiers that act more or less like macros. The idea is that this structure is meaningful at runtime but is semantically already accounted for in the existing definition, and should always be generated only at runtime.

replace_node ($old_node, $new_node)

There are times when dynamically changing semantics force us to reevaluate an existing node during the build phase. We use replace_node to replace the existing node with the newly interpreted variant. It works by actual pointer. If the old node isn't found, nothing will happen.

Setting parts of a node: set_name($name), set_label($label), set_parmlist (@list), set_parameter($key, $value), set_optionlist (@list), set_option($key, $value)

These are handy for building a node from scratch.

The build process: build(), preprocess(), preprocess_line(), decode_line(), parse_body(), build_payload(), build_children(), add_to_parent(), post_build()

The build function parses the body of the tag, then builds the payload it defines, then calls build on each child if appropriate, then adds itself to its parent. It provides the hooks preprocess (checks for macro nature and expresses if so), parse_body (asks the application to call the appropriate parser for the tag), build_payload (does nothing by default), build_children (calls build on each element), and add_to_parent (does nothing by default).

If this tag corresponds to a macro, then substitution takes place before parsing, in the preprocess step.
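
The order of the hooks can be sketched with a toy class. Sketch::Node is hypothetical and is not a Decl::Node subclass; each hook only records that it ran, and post_build is left out.

```perl
use strict;
use warnings;

package Sketch::Node;

my @trace;    # records the order in which hooks run

sub new {
    my ($class, $name, @children) = @_;
    return bless { name => $name, children => [@children] }, $class;
}

# The hooks only log here; a subclass would override them.
sub preprocess     { push @trace, "preprocess $_[0]{name}" }
sub parse_body     { push @trace, "parse_body $_[0]{name}" }
sub build_payload  { push @trace, "build_payload $_[0]{name}" }
sub build_children { $_->build for @{ $_[0]{children} } }
sub add_to_parent  { push @trace, "add_to_parent $_[0]{name}" }

# Same order as the description: body, payload, children, parent.
sub build {
    my ($self) = @_;
    $self->preprocess;
    $self->parse_body;
    $self->build_payload;
    $self->build_children;
    $self->add_to_parent;
    return $self;
}

sub trace { return @trace }

package main;

Sketch::Node->new('window', Sketch::Node->new('button'))->build;
print "$_\n" for Sketch::Node::trace();
```

The trace shows why plist needs an explicit build_children call: the parent's build_payload runs before any child has been built.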

STRUCTURE ACCESS

find($locator), findbyname($locator)

Given a node, finds a descendant using a simple XPath-like language. Once you build a recursive-descent parser facility into your language, this sort of thing gets a whole lot easier. The find function looks by tag; findbyname treats the tag as a type and thus the name as the search property.

Generation separators are '.', '/', or ':' depending on how you like it. Offsets by number are in round brackets (), while finding children by name is done with square brackets []. Square brackets [name] find tags named "name". Square brackets [name name2] find name lists (which nodes can have, yes), and square brackets with an = or =~ can also search for nodes by other values.

You can also pass the results of a parse (the arrayref tree) in as the path; this allows you to build the parse tree using other tools instead of forcing you to build a string (it also allows a single parse result to be used recursively without having to parse it again).

match($pathelement), matchbyname($pathelement)

Returns a true value if the node matches the path element specified; otherwise, returns a false value.

first($nodename)

Given a node, finds a descendant with the given tag anywhere in its descent. Uses the same path notation as find.

search($nodename)

Given a node, finds all descendants with the given tag.

search_data($type)

Given a node, finds all its descendants that match the given type in either name or tag. If the type ends in a ':', will only return meta nodes.

describe, myline, describe_content

The describe function is used to get our code back out so we can reparse it later if we want to. It includes the body and any children. The myline function just does that without the body and children (just the actual line). The describe_content function does just the body and children (without the actual line).

We could also use this to check the output of the parser, which notoriously just stops on a line if it encounters something it's not expecting.

sketch (), sketch_c(), sketch_d()

Returns a thin structure reflecting the nodal structure of the node in question:

   ['tag',
     [['child1', []],
      ['child2', []]]]
      

Like that. I'm building it for testing purposes, but it might be useful for something else, too.

The sketch_c variant also includes the class of each node, and the sketch_d variant runs the whole thing through Dumper first.
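
The shape of that structure is easy to reproduce recursively. In this sketch the nodes are plain hashrefs and sketch_of is a hypothetical name.

```perl
use strict;
use warnings;

# Nodes modeled as hashrefs with a tag and a list of children.
my $tree = {
    tag      => 'tag',
    children => [
        { tag => 'child1', children => [] },
        { tag => 'child2', children => [] },
    ],
};

# Hypothetical stand-in for sketch(): [tag, [sketches of the children]].
sub sketch_of {
    my ($node) = @_;
    return [ $node->{tag}, [ map { sketch_of($_) } @{ $node->{children} } ] ];
}

my $sketch = sketch_of($tree);
print $sketch->[0], ' has ', scalar @{ $sketch->[1] }, " children\n";
```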

mylocation()

This reports the node's own location in the code tree.

go($item)

For callable nodes, this is one way to call them. The default is to call the go methods of all the children of the node, in sequence. The last result is returned as our result (this means that the overall tree may have a return value if you set things up right).
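
A sketch of that default behavior, using a hypothetical Sketch::Callable class in which leaf nodes carry a code reference:

```perl
use strict;
use warnings;

package Sketch::Callable;

sub new {
    my ($class, %args) = @_;
    return bless { children => [], %args }, $class;
}

# Default behaviour as described: call each child in order and
# return whatever the last one returned.
sub go {
    my ($self, @args) = @_;
    return $self->{code}->(@args) if $self->{code};    # leaf node
    my $result;
    $result = $_->go(@args) for @{ $self->{children} };
    return $result;
}

package main;

my $root = Sketch::Callable->new(children => [
    Sketch::Callable->new(code => sub { 'first' }),
    Sketch::Callable->new(code => sub { 'last' }),
]);
print $root->go, "\n";
```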

closure(...)

For callable nodes, this is the other way to call them; it returns the closure created during initialization. Note that the default closure is really boring.

iterate()

Returns an Iterator::Simple iterator over the body of the node. If the body is a text body, each call returns a line. If the body is a bracketed code body, it is executed to return an iterable object. Yes, this is neat.

If we're a parser macro, we'll run our special parser over the body instead of the normal parser.

TODO: shouldn't this be recursive for structured nodes?

TODO: might want to do something clever with a code ref tag. (I.e. if the tag is a reference but also has a code block, perhaps evaluate the code block to figure out the reference or something. This might be a plate of beans.)

text()

This returns a tokenstream on the node's body permitting a consumer to read a series of words interspersed with formatting commands. The formatting commands are pretty loose - essentially, "blankline" is the only one. Punctuation is treated as letters in words; that is, only whitespace is elided in the tokenization process.

If the node has been parsed, it probably doesn't have a body any more, so this will return a blank tokenstream. On the other hand, if the node is callable, it will be called, and the result will be used as input to the tokenstream - same rules as iterate above.

express(), content()

The content function returns the iterated content from iterate(), assembled into lines with as few newlines as possible. The express function is normally an alias for content.

event_context

If the node is an event context (e.g. a window or frame or dialog), this should return the payload of the node. Otherwise, it returns the event_context of the parent node.

root

Returns the root of the tree by asking the parent - all nodes do this. The top node at Decl returns itself.

error

Error handling is the part of programming I'm worst at. But you just have to bite the bullet and address your weaknesses, so here is an error marker function. If there's a problem with a node specification, this marks it. Later we'll do something sensible with it. TODO: something sensible.

find_data

The find_data function finds a data node starting at a given point in the tree. Right now, it's just going to look for nodes by name/tag, but more mature locators should follow eventually.

find_context (tag, name)

Here, we search for a node with a given name and tag in almost the same way as find_data - first searching our siblings, then our parent's siblings, and so on. Used to look for macro definitions, databases, whatever. If either the tag or the name is omitted, it won't be used for comparison (thus the first tag of any name or the first named tag of any type will be returned).

Note I said "almost". Any node that comes after the caller won't be considered context. (Neither will the caller itself.) Ditto the parent, grandparent, etc. What that means is that context has to appear in the source before the point where find_context is called.
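
The "earlier in the source only" rule can be sketched like this. Nodes are plain hashrefs, the helper names are hypothetical, and scanning the earlier siblings in source order is an assumption.

```perl
use strict;
use warnings;

# Nodes are hashrefs; each knows its parent and its ordered children.
sub node {
    my ($tag, $name, @children) = @_;
    my $self = { tag => $tag, name => $name, children => [@children] };
    $_->{parent} = $self for @children;
    return $self;
}

# Hypothetical stand-in for find_context(tag, name): look only at
# siblings that precede the caller, then repeat one level up.
sub find_context_sketch {
    my ($from, $tag, $name) = @_;
    my $parent = $from->{parent} or return;
    for my $sibling (@{ $parent->{children} }) {
        last if $sibling == $from;    # nothing at or after the caller counts
        next if defined $tag  && $sibling->{tag}  ne $tag;
        next if defined $name && $sibling->{name} ne $name;
        return $sibling;
    }
    return find_context_sketch($parent, $tag, $name);
}

my $caller = node('use', 'here');
my $root   = node('root', 'top',
    node('database', 'early'),
    node('group', 'g', node('macro', 'm'), $caller),
    node('database', 'late'),
);
my $found = find_context_sketch($caller, 'database');
print $found->{name}, "\n";
```

The database named "late" is never considered, because it appears after the group that contains the caller.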

find_ref (tag, name)

The find_ref function looks for tag-and-name combinations that don't have the "is_reference" flag set. It returns the first it finds. If either tag or name is undef, it ignores that spec.

deref ()

The deref function uses find_ref to dereference a reference tag. If the tag you give it isn't a reference, you'll just get that tag back. If it's a dangling reference, you'll get undef.

set(), get(), get_pair()

These provide a place for object constructors to stash useful information. The get function gets a parameter if the named user variable hasn't been set. It also allows the specification of a default value.

get_pair gets a pair of named values as an arrayref, with a single arrayref default if neither is found. The individual defaults are assumed to be 0.

VALUES

The value system in a Decl node is getting pretty darned complex. Essentially, though, each node has a value lookup hash that either has scalar values directly or closures that can be used as proxies for values found in other nodes. (For example, if a node is a macro instantiation, then mostly we're going to be referring to values in the definition, not in the instance. If a node hasn't explicitly defined a value but its parent has, then when we set that value we'll want to set it in the parent, not in the child. And so on.)

When we first want to use a given value in a node, we'll call "find_value". That will return a closure that can be called to get or set the value. If the value can't be set, the closure will simply have no effect. The closure will be stashed locally so that it need only be located once, and we're always assured of being able to access the same storage location for a given name.

find_value($var), with helper function get_value_closure

To find a value:

1. Return any previously located closure.
2. If we're a macro instantiation, look at the macro definition.
3. See if there's a local definition for the value; return it if so.
4. See if we have any local constant definitions (our children, evaluated as values).
5. Check our event context.
6. If we're still not in luck, ask our parent to do the same.
7. Otherwise, return "undefined". A set will then create a local variable if necessary.

The closure returned by get_value_closure has the same signature as the varhandlers used by the value tag. So weird as it sounds, the key and value are in parameters 2 and 3.
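
The cached-closure idea can be sketched in a few lines. This covers only the cache, the local definition, the parent walk, and the local fallback; macros, constants, and event contexts are left out. The closure here takes just an optional new value, which is simpler than the varhandler signature, and the name find_value_sketch is hypothetical.

```perl
use strict;
use warnings;

# Hypothetical stand-in for find_value: locate the node that owns a value,
# wrap access in a closure, and cache the closure on the asking node.
sub find_value_sketch {
    my ($node, $name) = @_;
    return $node->{closures}{$name} if $node->{closures}{$name};

    # Walk up the parents until some node defines the value.
    my $owner = $node;
    $owner = $owner->{parent}
        while $owner && !exists $owner->{values}{$name};
    $owner ||= $node;    # nobody defines it: a set creates it locally

    my $closure = sub {
        $owner->{values}{$name} = $_[0] if @_;
        return $owner->{values}{$name};
    };
    return $node->{closures}{$name} = $closure;
}

my $parent = { values => { title => 'old' } };
my $child  = { values => {}, parent => $parent };

my $title = find_value_sketch($child, 'title');
$title->('new');                       # writes through to the parent
print $parent->{values}{title}, "\n";
```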

value($var), setvalue($var, $value)

Accesses the global application value named.

get_value($var)

Given the name of a value, we can find it in various places, which we look at in order:

- A set value in the node asked
- Rinse and repeat for the node's parent.

Names starting with an asterisk find parts of the node itself: *name, *label, *parameter <n>, *option <n>, *content, and anything else I forgot and add later. A double asterisk gets the same values from the parent. Triple asterisk, grandparent, etc.

express_value($valuespec)

A full value spec pipes a given value through a series of filters:

   <value>[|<filter>]*
   

A filter is simply a function that takes one parameter. (This is an oversimplification: the filter can be given parameters that are space-delimited.)

If no lookup value is desired as a starting value, you can also just start the pipe with a filter/function call, marked with an exclamation mark:

  !<filter>[|<filter>]*
  

Clear? Clear.
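
For the avoidance of doubt, here is a sketch of how such a spec could be evaluated. The value and filter tables, the filter names, and the function express_sketch are all made up for illustration; the real module resolves values and filters through the node tree.

```perl
use strict;
use warnings;

# Hypothetical lookup tables standing in for the node tree.
my %values  = (name => 'world');
my %filters = (
    uc     => sub { uc $_[0] },
    repeat => sub { $_[0] x ($_[1] // 2) },    # takes a space-delimited argument
    now    => sub { 'start' },
);

# Hypothetical stand-in for express_value($valuespec).
sub express_sketch {
    my ($spec) = @_;
    my ($head, @stages) = split /\|/, $spec;
    my $value;
    if ($head =~ s/^!//) {
        unshift @stages, $head;     # "!filter" starts the pipe with a call
    }
    else {
        $value = $values{$head};
    }
    for my $stage (@stages) {
        my ($filter, @args) = split ' ', $stage;
        $value = $filters{$filter}->($value, @args);
    }
    return $value;
}

print express_sketch('name|uc|repeat 3'), "\n";
print express_sketch('!now|uc'), "\n";
```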

register_varhandler ($event, $handler)

Registers a variable handler in the event context. If there is a handler registered for a name, it will be called instead of the normal hash read and write. This means you can attach active content to a variable, then treat it just like any other variable in your code.
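
A sketch of how a handler can shadow ordinary storage. Sketch::Context is a hypothetical class, and the handler signature used here (context, key, optional value) is an assumption.

```perl
use strict;
use warnings;

package Sketch::Context;

sub new { return bless { values => {}, handlers => {} }, shift }

# Hypothetical signature: the handler receives the context, the key,
# and (for a write) the new value.
sub register_varhandler {
    my ($self, $name, $handler) = @_;
    $self->{handlers}{$name} = $handler;
}

sub value {
    my ($self, $name) = @_;
    my $handler = $self->{handlers}{$name};
    return $handler ? $handler->($self, $name) : $self->{values}{$name};
}

sub setvalue {
    my ($self, $name, $value) = @_;
    my $handler = $self->{handlers}{$name};
    return $handler->($self, $name, $value) if $handler;
    return $self->{values}{$name} = $value;
}

package main;

my $context = Sketch::Context->new;
my @log;
$context->register_varhandler(status => sub {
    my ($self, $key, @value) = @_;
    push @log, "set $key to $value[0]" if @value;
    return 'active';
});

$context->setvalue(status => 'idle');    # goes to the handler, not the hash
$context->setvalue(plain  => 'stored');  # ordinary hash write
print $context->value('status'), ' ', $context->value('plain'), "\n";
```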

subs()

Returns all our direct children named 'sub', plus the same thing from our parent. Our answers mask our parent's.

find_filter(filter), call_filter(filter, value)

Finds a filter by name from a given point in the tree and calls it with a set of parameters.

OUTPUT

write(), log(), output()

The write function is supported for any node; by default it simply passes its arguments up to its parent. The top of the tree will print everything to STDOUT - by default. At any point in the tree, though, a node may claim ownership of the output stream by having an option [output]; any write called below that node's parent will be written to that node's write. Obviously, this is a good way to use files.

The log function is exactly the same, except the default is to write to STDERR and the option to use is [log].

There is another difference: a file used as [output] will by default start from scratch ('w'), while a file used as [log] will append its material ('a'). Either is opened during build, and closed when the program closes.

If it's not in [output] or [log] mode, however, each call to write on a file is independent; the file is closed afterwards and no handle is kept around. This can be overridden with a (keepopen) parameter or a (>>) parameter for appending. (Any appending file will be opened for appending during build and closed when the program closes.)

If a file is in keepopen mode, the buffers are flushed after each write/log.

The output function defaults to write. For a macro definition, though, it is used to build the macro to be instantiated.
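
The delegation behind write can be sketched as below. This is a simplification under stated assumptions: Sketch::Writer is a hypothetical class, the owner of the output stream is modeled as an ancestor holding an in-memory sink rather than a file node with an [output] option, and file handling is left out entirely.

```perl
use strict;
use warnings;

package Sketch::Writer;

sub new {
    my ($class, %args) = @_;
    return bless { %args }, $class;
}

# Default write: hand the text to the parent. A node holding a sink
# (standing in for the [output] option) keeps the text; the root prints.
sub write {
    my ($self, @text) = @_;
    if ($self->{sink})   { push @{ $self->{sink} }, @text; return }
    if ($self->{parent}) { return $self->{parent}->write(@text) }
    print @text;
    return;
}

package main;

my @captured;
my $root  = Sketch::Writer->new;
my $file  = Sketch::Writer->new(parent => $root, sink => \@captured);
my $leaf  = Sketch::Writer->new(parent => $file);
my $other = Sketch::Writer->new(parent => $root);

$leaf->write("to the file\n");      # captured by $file
$other->write("to STDOUT\n");       # reaches the root and is printed
```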

AUTHOR

Michael Roberts, <michael at vivtek.com>

BUGS

Please report any bugs or feature requests to bug-decl at rt.cpan.org, or through the web interface at http://rt.cpan.org/NoAuth/ReportBug.html?Queue=Decl. I will be notified, and then you'll automatically be notified of progress on your bug as I make changes.

LICENSE AND COPYRIGHT

Copyright 2010 Michael Roberts.

This program is free software; you can redistribute it and/or modify it under the terms of either: the GNU General Public License as published by the Free Software Foundation; or the Artistic License.

See http://dev.perl.org/licenses/ for more information.