NAME

Graph::Easy::Marpa - A Marpa- and Set::FA::Element-based parser for Graph::Easy

Synopsis

Sample Code

        #!/usr/bin/env perl
        
        use strict;
        use warnings;
        
        use Graph::Easy::Marpa;
        
        use Getopt::Long;
        
        use Pod::Usage;
        
        # -----------------------------------------------
        
        my($option_parser) = Getopt::Long::Parser -> new();
        
        my(%option);
        
        if ($option_parser -> getoptions
        (
         \%option,
         'cooked_file=s',
         'description=s',
         'format=s',
         'help',
         'input_file=s',
         'logger=s',
         'maxlevel=s',
         'minlevel=s',
         'output_file=s',
         'parsed_tokens_file=s',
         'stt_file=s',
         'type=s',
        ) )
        {
                pod2usage(1) if ($option{'help'});
        
                # Return 0 for success and 1 for failure.
        
                exit Graph::Easy::Marpa -> new(%option) -> run;
        }
        else
        {
                pod2usage(2);
        }

This is shipped as scripts/gem.pl, although the shipped version has built-in help.

See also scripts/lex.pl and scripts/parse.pl.

Sample output

Unpack the distro and copy html/*.html and html/*.svg to your web server's doc root directory.

Then, point your browser at 127.0.0.1/index.html.

Or, hit http://savage.net.au/Perl-modules/html/graph.easy.marpa/index.html.

Modules

o Graph::Easy::Marpa

The current module, which documents the set of modules.

It uses Graph::Easy::Marpa::Lexer, Graph::Easy::Marpa::Parser and Graph::Easy::Marpa::Renderer::GraphViz2 to render a Graph::Easy-syntax file into a (by default) *.svg file.

See scripts/gem.pl and scripts/gem.sh.

o Graph::Easy::Marpa::Lexer

See Graph::Easy::Marpa::Lexer.

Processes a raw Graph::Easy graph definition and outputs a cooked representation of that graph in a language which can be read by the parser.

See scripts/lex.pl and scripts/lex.sh.

o Graph::Easy::Marpa::Lexer::DFA

See Graph::Easy::Marpa::Lexer::DFA.

Wraps Set::FA::Element, which is what actually lexes the input Graph::Easy-syntax graph definition.

o Graph::Easy::Marpa::Parser

See Graph::Easy::Marpa::Parser.

Accepts a graph definition in the cooked language and builds a data structure representing the graph.

See scripts/parse.pl and scripts/parse.sh.

o Graph::Easy::Marpa::Utils

Code to help with testing.

Description

Graph::Easy::Marpa provides a Marpa-based parser for Graph::Easy-style graph definitions.

See "Data Files and Scripts" for details.

Distributions

This module is available as a Unix-style distro (*.tgz).

See http://savage.net.au/Perl-modules/html/installing-a-module.html for help on unpacking and installing distros.

Installation

Install Graph::Easy::Marpa as you would for any Perl module:

Run:

        cpanm Graph::Easy::Marpa

or run:

        sudo cpan Graph::Easy::Marpa

or unpack the distro, and then either:

        perl Build.PL
        ./Build
        ./Build test
        sudo ./Build install

or:

        perl Makefile.PL
        make (or dmake or nmake)
        make test
        make install

Constructor and Initialization

new() is called as my($parser) = Graph::Easy::Marpa -> new(k1 => v1, k2 => v2, ...).

It returns a new object of type Graph::Easy::Marpa.

Key-value pairs accepted in the parameter list (see corresponding methods for details [e.g. maxlevel()]):

o cooked_file => $csv_file_name

This is the name of the file to write containing the tokens (items) output from Graph::Easy::Marpa::Lexer.

This file can be input to Graph::Easy::Marpa::Parser.

See also the 'parsed_tokens_file' key, below.

o description => $graph_description_string

Specify a string for the graph definition.

You are strongly encouraged to surround this string with '...' to protect it from your shell.

See also the 'input_file' key to read the graph from a file.

The 'description' key takes precedence over the 'input_file' key.

o dot_input_file => $file_name

Specify the name of a file that the rendering engine can write to, which will contain the input to dot (or whatever). This is good for debugging.

Default: ''.

If '', the file will not be created.

o format => $format_name

This is the format of the output file, to be created by the renderer.

Default is 'svg'.

o input_file => $graph_file_name

Read the graph definition from this file.

See also the 'description' key to read the graph from the command line.

The whole file is slurped in as 1 graph.

The first lines of the file can start with /^\s*#/, and will be discarded as comments.

The 'description' key takes precedence over the 'input_file' key.

o logger => $logger_object

Specify a logger object.

To disable logging, just set logger to the empty string.

The default value is an object of type Log::Handler which outputs to the screen.

This logger is passed to Graph::Easy::Marpa::Lexer, Graph::Easy::Marpa::Lexer::DFA, Graph::Easy::Marpa::Parser and Graph::Easy::Marpa::Renderer::GraphViz2.

o maxlevel => $level

This option is only used if Graph::Easy::Marpa::Lexer or Graph::Easy::Marpa::Parser creates an object of type Log::Handler. See Log::Handler::Levels.

The default 'maxlevel' is 'info'. A typical value is 'debug'.

o minlevel => $level

This option is only used if Graph::Easy::Marpa::Lexer or Graph::Easy::Marpa::Parser creates an object of type Log::Handler. See Log::Handler::Levels.

The default 'minlevel' is 'error'.

No lower levels are used.

o output_file => $output_file_name

If an output file name is supplied, and a rendering object is also supplied, then this call is made:

        $self -> renderer -> run(format => $self -> format, items => [$self -> items -> print], output_file => $file_name);

This is how the plotted graph is actually created.

o parsed_tokens_file => $token_file_name

This is the name of the file to write containing the tokens (items) output from Graph::Easy::Marpa::Parser.

See also the 'cooked_file' key, above.

o renderer => $renderer_object

This is the object whose run() method will be called to render the result of parsing the cooked file received from Graph::Easy::Marpa::Lexer.

The format of the parameters passed to the renderer is documented in "run(%arg)" in Graph::Easy::Marpa::Renderer::GraphViz2, which is the default value for this object.

o report_items => $Boolean

Calls "report()" in Graph::Easy::Marpa::Parser to report, via the log, the items recognized in the cooked file.

o stt_file => $stt_file_name

Specify which file contains the state transition table.

Default: ''.

The default value means the STT is read from the source code of Graph::Easy::Marpa::Lexer.

Candidate files are '', 'data/default.stt.csv' and 'data/default.stt.ods'.

The type of this file must be specified by the 'type' key.

o timeout => $seconds

Run the DFA for at most this many seconds.

Default: 3.

o type => $stt_file_type

Specify the type of the stt_file: '' for internal, csv for CSV, or ods for Open Office Calc spreadsheet.

Default is ''.

The default value means the STT is read from the source code of Graph::Easy::Marpa::Lexer.

This option must be used with the 'stt_file' key.
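The keys above can be combined in a single call to new(). What follows is a minimal sketch, not the shipped scripts/gem.pl: the graph string is only illustrative, run_gem() is a hypothetical wrapper, and the code assumes nothing about what run() logs. It guards against the module being absent, so that the sketch still runs on a machine where Graph::Easy::Marpa is not installed.

```perl
#!/usr/bin/env perl

use strict;
use warnings;

# Keys taken from the list above. The graph string itself is illustrative.
my %option = (
    description => '[node.1] -> [node.2]',
    maxlevel    => 'debug',
    minlevel    => 'error',
);

# Hypothetical wrapper: returns run()'s result, or undef when the module
# is not installed or run() dies.
sub run_gem {
    my (%opt) = @_;

    return unless eval { require Graph::Easy::Marpa; 1 };

    my $result = eval { Graph::Easy::Marpa->new(%opt)->run };

    warn "run() died: $@" if $@;

    return $result;
}

my $result = run_gem(%option);

print defined $result
    ? "run() returned $result\n"
    : "Graph::Easy::Marpa is not installed, or run() died\n";
```

Since no 'output_file' key is given here, the renderer should not be invoked, going by the description of that key.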

Data Files and Scripts

Overview of the Data Flow

The lexer and the parser work like this:

o Lexer input
o The State Transition Table (STT) file

The STT is stored outside the code (unlike the grammar for the cooked graph definition).

The current design ships the STT in 2 files, data/default.stt.ods and data/default.stt.csv.

*.ods is an Open Office Calc spreadsheet, and *.csv is a Comma-Separated Values file.

This allows any user to change the STT as an experiment.

I work with the *.ods file, and export it to the *.csv file.

The program scripts/stt2html.pl converts the *.csv file to html for ease of display.

See new(stt_file => $stt_file_name, type => $stt_file_type) in "Constructor and Initialization" in Graph::Easy::Marpa::Lexer for details.

o The raw Graph::Easy Graph Definition

A definition looks like '[node.1]{a:b;c:d}<->{e:f;}<=>{g:h}[node.2]{i:j}===[node.3]{k:l}'.

Node names are: node.1, node.2 and node.3.

Edge names are: <->, <=> and ===.

And yes, unlike the original Graph::Easy syntax, you can use a series of edges between 2 nodes, as with <-> and <=> above.

Nodes and edges can have attributes, very much like CSS. The attributes in this sample are meaningless, and are just to demonstrate the syntax.

The lexer can accept a graph definition in 2 ways, from a file or from a string:

See new(file => $graph_file_name) or new(graph => $graph_string) in "Constructor and Initialization" in Graph::Easy::Marpa::Lexer for details.

o Lexer processing

Call the lexer as my($result) = Graph::Easy::Marpa::Lexer -> new(%options) -> run.

run() returns 0 for success and 1 for failure.

run() dies with an error message upon error.

o Lexer output

The lexer writes a cooked graph definition to a file, using an intermediary language I invented just for this purpose.

The output file is in *.csv format. This file becomes input for the parser.

Of course, to exercise the parser, such files can be created manually.

See new(cooked => $csv_file_name) in "Constructor and Initialization" in Graph::Easy::Marpa::Lexer for details.

o Parser input
o The Grammar for the Cooked Graph Definition

The grammar is stored inside the code (unlike the STT).

This grammar is recognized by Marpa, which is the basis of the parser. See "grammar()" in Graph::Easy::Marpa::Parser.

o The Cooked Graph Definition

The *.csv file output by the lexer, or created manually, is the other input to the parser.

o Parser processing

Call the parser as my($result) = Graph::Easy::Marpa::Parser -> new(%options) -> run.

run() returns 0 for success and 1 for failure.

run() dies with an error message upon error.

o Parser output

After the parser runs successfully, the parser object holds a Set::Array object of tokens representing the graph.

An arrayref of items can be retrieved with the items() method in both the lexer and the parser.

The format of this array is documented below, in the "FAQ".

Later, a formatter will be written to position the tokens in space, for passing to a plotter such as 'dot'.

Data and Script Interaction

Sample input files for the lexer are in data/*.raw. Sample output files from the lexer, which are also input files for the parser, are in data/*.cooked.

o scripts/lex.pl and scripts/lex.sh

These use Graph::Easy::Marpa::Lexer.

They run the lexer on 1 *.raw input file, and produce an arrayref of items, and - optionally - 1 *.cooked output file.

Run scripts/lex.pl -h for samples of how to drive it.

Try:

        perl -Ilib scripts/lex.pl -stt data/default.stt.csv -t csv -i data/node.04.raw -c data/node.04.cooked
        cat data/node.04.raw
        cat data/node.04.cooked

        perl -Ilib scripts/lex.pl -stt data/default.stt.csv -t csv -i data/node.05.raw -c data/node.05.cooked
        cat data/node.05.raw
        cat data/node.05.cooked

You can use scripts/lex.sh to simplify this process:

        scripts/lex.sh data/node.05.raw data/node.05.cooked
        scripts/lex.sh data/graph.12.raw data/graph.12.cooked

o scripts/parse.pl and scripts/parse.sh

These use Graph::Easy::Marpa::Parser.

They run the parser on 1 *.cooked input file, and produce an arrayref of items.

Run scripts/parse.pl -h for samples of how to drive it.

Try:

        cat data/node.05.cooked
        perl -Ilib scripts/parse.pl -i data/node.05.cooked

You can use scripts/parse.sh to simplify this process:

        scripts/parse.sh data/node.05.cooked
        scripts/parse.sh data/graph.12.cooked

o scripts/gem.pl and scripts/gem.sh

This uses Graph::Easy::Marpa to combine calls to Graph::Easy::Marpa::Lexer and Graph::Easy::Marpa::Parser.

Run scripts/gem.pl -h for samples of how to drive it.

Try, using a shell variable for brevity:

        X=graph.13
        perl -Ilib scripts/gem.pl -i data/$X.raw -c $X.cooked -o $X.svg -p $X.items
        cat $X.cooked
        cat $X.items
        cat $X.svg

You can use scripts/gem.sh to simplify this process:

        X=graph.13
        scripts/gem.sh $X
        cat $X.cooked
        cat $X.items
        cat $X.svg

The Subset of Graph::Easy Graph Definitions Accepted by the Parser

Obviously, the STT in data/default.stt.ods and data/default.stt.csv defines precisely the currently acceptable syntax for graph definitions.

So, this section gives a more casual explanation.

o Attributes
o Attribute names

The attribute name must match /^[a-z]+$/.

o Attribute values

The attribute value is any string up to the next ';' or '}'.

Attribute values may be quoted with "..." or '...'. These quotes are stripped.

o Classes

Class + subclass names must match /^(edge|global|graph|group|node)(\.[a-z]+)?$/.

The name before the '.' is the class name.

'global' is used to specify whether you want a directed or undirected graph. The default is directed.

        global {directed: 1} [node.1] -> [node.2]

'graph' is used to specify the direction of the graph as a whole, via the 'rankdir' attribute, whose value must be one of: LR or RL or TB or BT. The default is TB.

        graph {rankdir: LR} [node.1] -> [node.2]

The name after the '.' is the subclass name. And if '.' is present, the subclass name must be present. This means things like 'edge.' etc are syntax errors.

You use the subclass name in the attributes of an edge, a group or a node, whereas 'global' and 'graph' appear only once, at the start of the input stream.

        node {shape: square} node.forest {color: green}
        [node.1] -> [node.2] {class: forest} -> [node.3] {shape: circle; color: blue}

Here, node.1 gets the default shape, square, and node.2 gets both shape square and color green. node.3 gets shape circle and color blue.

As always, specific attributes override class attributes.

o Daisy-chains
o Edges

Edges must match /^(->|--)$/.

Edges can be daisy-chained by using a comma, ',', newline, space, or attributes, '{...}', to separate them.

Hence both of these are valid: '->,->{color:green}' and '->{color:red}->{color:green}'.

Edges can have attributes such as arrowhead, arrowtail, etc. See Graphviz.

The edge is actually rendered, via the default renderer GraphViz2, by Graphviz.

Note: The syntax for edges is just a visual clue for the user. Whether the graph is directed or undirected depends on the value of the 'directed' attribute present (explicitly or implicitly) in the input stream.

The default is {directed: 1}. See data/class.global.01.raw for a case where we use {directed: 0} attached to class 'global'.

o Groups

Groups can be daisy chained by juxtaposition, or by using a newline or space to separate them.

o Nodes

Nodes can be daisy chained by juxtaposition, or by using the comma, ',', newline, space, or attributes, '{...}', to separate them.

Hence all of these are valid: '[node.1][node.2]' and '[node.1],[node.2]' and '[node.1]{color:red}[node.2]'.

o Events

These are part of the STT, but are not part of the Graph::Easy language.

Their names must match /^[a-zA-Z_][a-zA-Z_0-9.]*$/.

o Groups

Group names must match /^[a-zA-Z_.][a-zA-Z_0-9. ]*$/.

o Nodes

Node names must match /^[a-zA-Z_0-9. ]+$/.

Since leading and trailing spaces are stripped, a single space can be used to represent the anonymous node.

o States

These are part of the STT, but are not part of the Graph::Easy language.

Their names must match /^[a-zA-Z_][a-zA-Z_0-9]*$/.

In the STT, this regexp applies to both the State name column ('C' in the spreadsheet data/default.stt.ods) and the Next state name column ('E').
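The regexps quoted in this section can be collected into one table and used to check candidate names. This is a minimal sketch, assuming the regexps are applied exactly as printed above; %name_regexp and is_valid_name() are hypothetical names for illustration, and are not the validate_*() subs in Graph::Easy::Marpa::Lexer::DFA.

```perl
use strict;
use warnings;

# Regexps copied from the section above, keyed by the kind of name they check.
my %name_regexp = (
    attribute => qr/^[a-z]+$/,
    class     => qr/^(edge|global|graph|group|node)(\.[a-z]+)?$/,
    edge      => qr/^(->|--)$/,
    event     => qr/^[a-zA-Z_][a-zA-Z_0-9.]*$/,
    group     => qr/^[a-zA-Z_.][a-zA-Z_0-9. ]*$/,
    node      => qr/^[a-zA-Z_0-9. ]+$/,
    state     => qr/^[a-zA-Z_][a-zA-Z_0-9]*$/,
);

# Hypothetical helper: returns 1 if $name is acceptable for $kind, else 0.
sub is_valid_name {
    my ($kind, $name) = @_;
    my $regexp = $name_regexp{$kind} or die "Unknown kind: $kind\n";

    return $name =~ $regexp ? 1 : 0;
}

for my $case (
    [class => 'node.forest'],
    [class => 'edge.'],
    [edge  => '->'],
    [edge  => '=>'],
    [node  => 'node.1'],
) {
    my ($kind, $name) = @$case;

    printf "%-5s %-14s %s\n", $kind, "'$name'",
        is_valid_name($kind, $name) ? 'ok' : 'rejected';
}
```

With these regexps, 'edge.' is rejected because a '.' must be followed by a subclass name, and '=>' is rejected because only '->' and '--' are listed as edges.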

Methods

cooked_file([$csv_file_name])

Here, the [] indicate an optional parameter.

Get or set the name of the file to write containing the tokens (items) output from Graph::Easy::Marpa::Lexer.

See also the parsed_tokens_file() method, below.

description([$graph_description_string])

Here, the [] indicate an optional parameter.

Get or set the string for the graph definition.

See also the input_file() method to read the graph from a file, below.

The value supplied to the description() method takes precedence over the value read from the input file.
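The precedence rule can be sketched in plain Perl. This is an illustration only: graph_definition() is a hypothetical helper, not a method of this module, and the temporary file merely stands in for a *.raw input file.

```perl
use strict;
use warnings;

use File::Temp qw(tempfile);

# Hypothetical helper: choose the graph definition, preferring
# 'description' over the contents of 'input_file'.
sub graph_definition {
    my (%opt) = @_;

    return $opt{description}
        if defined $opt{description} && length $opt{description};

    die "Neither description nor input_file was given\n"
        unless defined $opt{input_file} && length $opt{input_file};

    open my $fh, '<', $opt{input_file}
        or die "Can't open $opt{input_file}: $!\n";
    local $/;    # Slurp the whole file as 1 graph.
    my $text = <$fh>;
    close $fh;

    return $text;
}

# A temporary file standing in for a *.raw input file.
my ($fh, $file_name) = tempfile(UNLINK => 1);
print {$fh} "[from.file]\n";
close $fh;

# Only the file is given, so the file is read.
print graph_definition(input_file => $file_name);

# Both are given, so the string wins and the file is not read.
print graph_definition(description => '[from.string]', input_file => $file_name), "\n";
```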

dot_input_file([$file_name])

Here, the [] indicate an optional parameter.

Get or set the name of the file into which the rendering engine will write the input to dot (or whatever).

format([$format])

Here, the [] indicate an optional parameter.

Get or set the format of the output file, to be created by the renderer.

input_file([$graph_file_name])

Here, the [] indicate an optional parameter.

Get or set the name of the file to read the graph definition from.

See also the description() method.

The whole file is slurped in as 1 graph.

The first lines of the file can start with /^\s*#/, and will be discarded as comments.

The value supplied to the description() method takes precedence over the value read from the input file.
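The comment rule can be illustrated as follows. This is a sketch under the assumption that only the leading run of lines matching /^\s*#/ is dropped, as the wording "first lines" suggests; strip_leading_comments() is a hypothetical helper, not the module's own code.

```perl
use strict;
use warnings;

# Hypothetical helper: drop leading lines matching /^\s*#/ and return the
# remaining lines joined back into 1 string.
sub strip_leading_comments {
    my ($text) = @_;
    my @line = split /\n/, $text;

    shift @line while @line && $line[0] =~ /^\s*#/;

    return join "\n", @line;
}

my $raw = <<'EOS';
# A sample graph.
  # Indented comments count too.
[node.1] -> [node.2]
EOS

print strip_leading_comments($raw), "\n";
```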

log($level, $s)

Calls $self -> logger -> $level($s).
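The dispatch in log() relies on Perl allowing a method name to be held in a scalar. Here is a minimal sketch of that idea, using a stand-in My::Logger class rather than Log::Handler; log_message() is a hypothetical function, and the guard for an empty-string logger is an assumption based on the logger() documentation.

```perl
use strict;
use warnings;

# A stand-in logger for this sketch. It stores messages instead of printing.
package My::Logger;

sub new   { my ($class) = @_; return bless {message => []}, $class }
sub info  { my ($self, $s) = @_; push @{$self->{message}}, "info: $s";  return }
sub debug { my ($self, $s) = @_; push @{$self->{message}}, "debug: $s"; return }

package main;

# Hypothetical equivalent of log(): call the method named by $level on the
# logger, and do nothing when the logger is not an object (e.g. '').
sub log_message {
    my ($logger, $level, $s) = @_;

    return unless ref $logger;

    $logger->$level($s);

    return;
}

my $logger = My::Logger->new;

log_message($logger, info  => 'Lexer started');
log_message($logger, debug => 'Read 3 tokens');
log_message('',      info  => 'Never logged');

print "$_\n" for @{$logger->{message}};
```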

logger([$logger_object])

Here, the [] indicate an optional parameter.

Get or set the logger object.

To disable logging, just set logger to the empty string.

This logger is passed to Graph::Easy::Marpa::Lexer, Graph::Easy::Marpa::Lexer::DFA, Graph::Easy::Marpa::Parser and Graph::Easy::Marpa::Renderer::GraphViz2.

maxlevel([$string])

Here, the [] indicate an optional parameter.

Get or set the value used by the logger object.

This option is only used if Graph::Easy::Marpa::Lexer or Graph::Easy::Marpa::Parser creates an object of type Log::Handler. See Log::Handler::Levels.

minlevel([$string])

Here, the [] indicate an optional parameter.

Get or set the value used by the logger object.

This option is only used if Graph::Easy::Marpa::Lexer or Graph::Easy::Marpa::Parser creates an object of type Log::Handler. See Log::Handler::Levels.

output_file([$output_file_name])

Here, the [] indicate an optional parameter.

Get or set the name of the file to which the renderer will write the resultant graph.

This is how the plotted graph is actually created.

If no renderer is supplied, or no output file is supplied, nothing is written.
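The rule that both a renderer and an output file are needed can be sketched like this. My::Renderer is a stand-in that records its arguments instead of plotting, and maybe_render() is a hypothetical helper; only the argument names format, items and output_file come from the renderer call documented for the 'output_file' key.

```perl
use strict;
use warnings;

# Stand-in renderer: records the arguments it receives instead of plotting.
package My::Renderer;

sub new { my ($class) = @_; return bless {call => []}, $class }
sub run { my ($self, %arg) = @_; push @{$self->{call}}, {%arg}; return 0 }

package main;

# Hypothetical helper: render only when both a renderer and an output
# file name are available. Returns 1 if the renderer was called, else 0.
sub maybe_render {
    my (%opt) = @_;

    return 0 unless $opt{renderer};
    return 0 unless defined $opt{output_file} && length $opt{output_file};

    $opt{renderer}->run(
        format      => $opt{format},
        items       => $opt{items},
        output_file => $opt{output_file},
    );

    return 1;
}

my $renderer = My::Renderer->new;

# No output file, so the renderer is not called.
print maybe_render(renderer => $renderer, format => 'svg', items => [], output_file => ''), "\n";

# Both are present, so the renderer is called once.
print maybe_render(renderer => $renderer, format => 'svg', items => [], output_file => 'graph.svg'), "\n";
```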

parsed_tokens_file([$token_file_name])

Here, the [] indicate an optional parameter.

Get or set the name of the file to write containing the tokens (items) output from Graph::Easy::Marpa::Parser.

See also the cooked_file() method, above.

renderer([$rendering_object])

Here, the [] indicate an optional parameter.

Get or set the rendering object.

This is the object whose run() method will be called to render the result of parsing the cooked file received from Graph::Easy::Marpa::Lexer.

The format of the parameters passed to the renderer is documented in "run(%arg)" in Graph::Easy::Marpa::Renderer::GraphViz2, which is the default value for this object.

report_items([$Boolean])

Here, the [] indicate an optional parameter.

Get or set the flag to report, via the log, the items recognized in the cooked file.

Calls "report()" in Graph::Easy::Marpa::Parser to do the reporting.

stt_file([$stt_file_name])

The [] indicate an optional parameter.

Get or set the name of the file containing the state transition table.

This option is used in conjunction with the type() option.

timeout([$seconds])

The [] indicate an optional parameter.

Get or set the timeout for how long to run the DFA.

type([$type])

The [] indicate an optional parameter.

Get or set the value which determines what type of stt_file is read.

FAQ

o What is the purpose of this set of modules?

It's the basis of a long-term project to formalize the way Graph::Easy processes its graph definitions, which in turn is meant to make on-going support for Graph::Easy much easier.

o What are Graph::Easy graphs?

You really should read the Graph::Easy docs.

In short, it means a text string containing a definition of a graph, using a cleverly designed language, that can be used to describe the sort of graph you wish to plot. Then, Graph::Easy does the plotting. Here is a sample.

o So what's a sample of a Graph::Easy graph definition?

        [node_1]{color: red; style: circle} -> {class: fancy;} [node_2]{color: green;}

o How are graphs stored in RAM by the lexer and the parser?

See "FAQ" in Graph::Easy::Marpa::Lexer.

o How are attributes assigned to nodes and edges?

Since the scan of the input stream is linear, any attribute detected belongs to the nearest preceding node(s) or edge.

o How are attributes assigned to groups?

The only attributes which can be passed to a subgraph (group) are those that 'dot' accepts under the 'graph' part of a subgraph definition.

This means the attribute 'rank' cannot be passed, yet.

o Is there sample data I can examine?

See data/*.raw and the corresponding data/*.cooked and html/*.svg.

*.raw are input for the lexer, and *.cooked are output from the lexer.

Note: Some files contain deliberate mistakes. See above for instructions on running scripts/lex.pl and scripts/lex.sh.

o What about the fact that Graph::Easy can read various other definition formats?

I have no plans to support such formats. Nevertheless, having written these modules, it should be fairly easy to derive classes which perform that sort of work.

o What's with the regexp for class names in data/default.stt.ods?

We can't use \w+ because 'graph{a:b}' matches that under Perl 5.12.2.

o How do I re-generate the web page of demos?

By default, scripts/generate.index.pl outputs to File::Temp -> newdir(...). But if you run it with a command line parameter, that value will be used for the output directory.

o What are the defaults for GraphViz2, the default rendering engine?

         GraphViz2 -> new
         (
          edge    => $class{edge}   || {color => 'grey'},
          global  => $class{global} || {directed => 1},
          graph   => $class{graph}  || {rankdir => $self -> rankdir},
          logger  => $self -> logger,
          node    => $class{node} || {shape => 'oval'},
          verbose => 0,
         )

where $class{$name} is taken from the class declarations at the start of the input stream.
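The '||' fallbacks in that call can be tried in isolation. This is a sketch only: class_defaults() is a hypothetical helper, and the rankdir value 'TB' is passed in explicitly rather than taken from an object.

```perl
use strict;
use warnings;

# Hypothetical helper: build the per-class defaults. A class declared at the
# start of the input stream replaces the built-in default for that class.
sub class_defaults {
    my ($rankdir, %class) = @_;

    return (
        edge   => $class{edge}   || {color    => 'grey'},
        global => $class{global} || {directed => 1},
        graph  => $class{graph}  || {rankdir  => $rankdir},
        node   => $class{node}   || {shape    => 'oval'},
    );
}

# Suppose the input stream began with: node {shape: square}
my %default = class_defaults('TB', node => {shape => 'square'});

for my $name (sort keys %default) {
    my $attr = $default{$name};

    print "$name: ", join(', ', map { "$_ => $$attr{$_}" } sort keys %$attr), "\n";
}
```

Note that, as written, '||' swaps in the whole hashref for a class rather than merging keys, so a declared class that omits an attribute does not inherit the built-in value for it.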

TODO

Implement HTML-style labels

Use regexps from the STT to do more validation

At the moment, some validation is done in Graph::Easy::Marpa::Lexer::DFA by manually copying regexps from the STT to the subs validate_*().

Machine-Readable Change Log

The file CHANGES was converted into Changelog.ini by Module::Metadata::Changes.

Version Numbers

Version numbers < 1.00 represent development versions. From 1.00 up, they are production versions.

Thanks

Many thanks are due to the people who worked on Graph::Easy.

Jeffrey Kegler wrote Marpa, and has been helping me via private emails.

Support

Email the author, or log a bug on RT:

https://rt.cpan.org/Public/Dist/Display.html?Name=Graph::Easy::Marpa.

Author

Graph::Easy::Marpa was written by Ron Savage <ron@savage.net.au> in 2011.

Home page: http://savage.net.au/index.html.

Copyright

Australian copyright (c) 2011, Ron Savage.

        All Programs of mine are 'OSI Certified Open Source Software';
        you can redistribute them and/or modify them under the terms of
        The Artistic License, a copy of which is available at:
        http://www.opensource.org/licenses/index.html