The Perl Toolchain Summit needs more sponsors. If your company depends on Perl, please support this very important event.

NAME

Graph::Easy::Marpa - A Marpa- and Set::FA::Element-based parser for Graph::Easy

Synopsis

Sample Code

        #!/usr/bin/env perl
        
        use strict;
        use warnings;
        
        use Graph::Easy::Marpa;
        
        use Getopt::Long;
        
        use Pod::Usage;
        
        # -----------------------------------------------
        
        my($option_parser) = Getopt::Long::Parser -> new();
        
        my(%option);
        
        if ($option_parser -> getoptions
        (
         \%option,
         'cooked_file=s',
         'description=s',
         'format=s',
         'help',
         'input_file=s',
         'maxlevel=s',
         'minlevel=s',
         'output_file=s',
         'token_file=s',
        ) )
        {
                pod2usage(1) if ($option{'help'});
        
                # Return 0 for success and 1 for failure.
        
                exit Graph::Easy::Marpa -> new(%option) -> run;
        }
        else
        {
                pod2usage(2);
        }

This is shipped as scripts/gem.pl, although the shipped version has help built in.

See also scripts/lex.pl and scripts/parse.pl.

Sample output

Unpack the distro and copy html/*.html and html/*.svg to your web server's doc root directory.

Then, point your browser at 127.0.0.1/index.html.

Or, hit http://savage.net.au/Perl-modules/html/graph.easy.marpa/index.html.

Modules

o Graph::Easy::Marpa

The current module, which documents the set of modules.

It uses Graph::Easy::Lexer, Graph::Easy::Parser and Graph::Easy::Marpa::GraphViz2 to render a Graph::Easy-syntax file into a (by default) *.svg file.

See scripts/gem.pl and scripts/gem.sh.

o Graph::Easy::Marpa::Lexer

See Graph::Easy::Marpa::Lexer.

Processes a raw Graph::Easy graph definition and outputs a cooked representation of that graph in a language which can be read by the parser.

See scripts/lex.pl and scripts/lex.sh.

o Graph::Easy::Marpa::Lexer::DFA

See Graph::Easy::Marpa::Lexer::DFA.

Wraps Set::FA::Element, which is what actually lexes the input Graph::Easy-syntax graph definition.

o Graph::Easy::Marpa::Parser

See Graph::Easy::Marpa::Parser.

Accepts a graph definition in the cooked language and builds a data structure representing the graph.

See scripts/parse.pl and scripts/parse.sh.

o Graph::Easy::Marpa::Utils

Code to help with testing.

Description

Graph::Easy::Marpa provides a Marpa-based parser for Graph::Easy-style graph definitions.

See "Data Files and Scripts" for details.

Distributions

This module is available as a Unix-style distro (*.tgz).

See http://savage.net.au/Perl-modules/html/installing-a-module.html for help on unpacking and installing distros.

Installation

Install Graph::Easy::Marpa as you would for any Perl module:

Run:

        cpanm Graph::Easy::Marpa

or run:

        sudo cpan Graph::Easy::Marpa

or unpack the distro, and then either:

        perl Build.PL
        ./Build
        ./Build test
        sudo ./Build install

or:

        perl Makefile.PL
        make (or dmake or nmake)
        make test
        make install

Constructor and Initialization

new() is called as my($parser) = Graph::Easy::Marpa -> new(k1 => v1, k2 => v2, ...).

It returns a new object of type Graph::Easy::Marpa.

Key-value pairs accepted in the parameter list (see corresponding methods for details [e.g. maxlevel()]):

o cooked_file => $csv_file_name

This is the name of the file to write containing the tokens (items) output from Graph::Easy::Marpa::Lexer.

This file can be input to Graph::Easy::Marpa::Parser.

See also the token_file key, below.

o description => '[node.1]<->[node.2]'

Specify a string for the graph definition.

You are strongly encouraged to surround this string with '...' to protect it from your shell.

See also the 'input_file' key to read the graph from a file.

The 'description' key takes precedence over the 'input_file' key.

o format => $format_name

This is the format of the output file, to be created by the renderer.

Default is 'svg'.

o input_file => $graph_file_name

Read the graph definition from this file.

See also the 'graph' key to read the graph from the command line.

The whole file is slurped in as 1 graph.

The first lines of the file can start with /^\s*#/, and will be discarded as comments.

The 'description' key takes precedence over the 'input_file' key.

o maxlevel => $level

This option affects Log::Handler. See Log::Handler::Levels.

The default maxlevel is 'info'. A typical value is 'debug'.

o minlevel => $level

This option affects Log::Handler. See Log::Handler::Levels.

The default minlevel is 'error'.

No lower levels are used.

o output_file => $output_file_name

If an output file name is supplied, and a rendering object is also supplied, then this call is made:

        $self -> renderer -> run(format => $self -> format, items => [$self -> items -> print], output_file => $file_name);

This is how the plotted graph is actually created.

o renderer => $renderer_object

This is the object whose run() method will be called to render the result of parsing the cooked file received from Graph::Easy::Marpa::Lexer.

The format of the parameters passed to the renderer are documented in "run(%arg)" in Graph::Easy::Marpa::Renderer::GraphViz2, which is the default value for this object.

o report_items => $Boolean

Calls "report()" to report, via the log, the items recognized in the cooked file.

o token_file => $token_file_name

This is the name of the file to write containing the tokens (items) output from Graph::Easy::Marpa::Parser.

See also the cooked_file, above.

Data Files and Scripts

Overview of the Data Flow

The lexer and the parser work like this:

o Lexer input
o The State Transition Table (STT) file

The STT is stored outside the code (unlike the grammar for the cooked graph definition).

The current design ships the STT in 2 files, data/default.stt.ods and data/default.stt.csv.

*.ods is an Open Office Calc spreadsheet, and *.csv is a Comma-Separated Variable file.

This allows any user to change the STT as an experiment.

I work with the *.ods file, and export it to the *.csv file.

The program scripts/stt2html.pl converts the *.csv file to html for ease of display.

See new(stt_file => $stt_file_name, type => $stt_file_type) in "Constructor_and_Initialization" in Graph::Easy::Marpa::Lexer for details.

o The raw Graph::Easy Graph Definition

A definition looks like '[node.1]{a:b;c:d}<->{e:f;}<=>{g:h}[node.2]{i:j}===[node.3]{k:l}'.

Node names are: node.1, node.2 and node.3.

Edge names are: <->, <=> and ===.

And yes, unlike the original Graph::Easy syntax, you can use a series of edges between 2 nodes, as with <-> and <=> above.

Nodes and edges can have attributes, very much like CSS. The attributes in this sample are meaningless, and are just to demonstrate the syntax.

The lexer can accept a graph definition in 2 ways:

See new(file => $graph_file_name) or new(graph => $graph_string) in "Constructor_and_Initialization" in Graph::Easy::Marpa::Lexer for details.

o Lexer processing

Call the lexer as my($result) = Graph::Easy::Marpa::Lexer -> new(%options) -> run.

run() returns 0 for success and 1 for failure.

run() dies with an error message upon error.

o Lexer output

The lexer writes a cooked graph definition to a file, using an intermediary language I invented just for this purpose.

The output file is in *.csv format. This file becomes input for the parser.

Of course, to exercise the parser, such files can be created manually.

See new(cooked => $csv_file_name) in "Constructor_and_Initialization" in Graph::Easy::Marpa::Lexer for details.

o Parser input
o The Grammar for the Cooked Graph Definition

The grammar is stored inside the code (unlike the STT).

This grammar is recognized by Marpa, which is the basis of the parser. See "grammar()" in Graph::Easy::Marpa::Parser.

o The Cooked Graph Definition

The *.csv file output by the lexer, or created manually, is the other input to the parser.

o Parser processing

Call the parser as my($result) = Graph::Easy::Marpa::Parser -> new(%options) -> run.

run() returns 0 for success and 1 for failure.

run() dies with an error message upon error.

o Parser output

After the parser runs successfully, the parser object holds a Set::Array object of tokens representing the graph.

An arrayref of items can be retrieved with the items() method in both the lexer and the parser.

The format of this array is documented below, in the "FAQ".

Later, a formatter will be written to position the tokens in space, for passing to a plotter just as 'dot'.

Data and Script Interaction

Sample input files for the lexer are in data/*.raw. Sample output files from the lexer, which are also input files for the parser, are in data/*.cooked.

o scripts/lex.pl and scripts/lex.sh

These use Graph::Easy::Marpa::Lexer.

They run the lexer on 1 *.raw input file, and produce an arrayref of items, and - optionally - 1 *.cooked output file.

Run scripts/lex.pl -h for samples of how to drive it.

Try:

        perl -Ilib scripts/lex.pl -stt data/default.stt.csv -t csv -i data/node.04.raw -c data/node.04.cooked
        cat data/node.04.raw
        cat data/node.04.cooked

        perl -Ilib scripts/lex.pl -stt data/default.stt.csv -t csv -i data/node.05.raw -c data/node.05.cooked
        cat data/node.04.raw
        cat data/node.04.cooked

You can use scripts/lex.sh to simplify this process:

        scripts/lex.sh data/node.05.raw data/node.05.cooked
        scripts/lex.sh data/graph.12.raw data/graph.12.cooked
o scripts/parse.pl and scripts/parser.sh

These use Graph::Easy::Marpa::Parser.

They run the parser on 1 *.cooked input file, and produce an arrayref of items.

Run scripts/parse.pl -h for samples of how to drive it.

Try:

        cat data/node.05.cooked
        perl -Ilib scripts/parse.pl -i data/node.05.cooked

You can use scripts/parse.sh to simplify this process:

        scripts/parse.sh data/node.05.cooked
        scripts/parse.sh data/graph.12.cooked
o scripts/gem.pl and scripts/gem.sh

This uses Graph::Easy::Marpa to combine calls to Graph::Easy::Marpa::Lexer and Graph::Easy::Marpa::Parser.

Run scripts/gem.pl -h for samples of how to drive it.

Try, using an environment variable for brevity:

        X=graph.13
        perl -Ilib scripts/gem.pl -i data/$X.raw -c $X.cooked -t $X.items -o $X.svg
        cat $X.cooked
        cat $X.items
        cat $X.svg

You can use scripts/gem.sh to simplify this process:

        X=graph.13
        scripts/gem.sh $X
        cat $X.cooked
        cat $X.items
        cat $X.svg

The Subset of Graph::Easy Graph Definitions Accepted by the Parser

Obviously, the STT in data/default.stt.ods and data/default.stt.csv defines precisely the currently acceptable syntax for graph definitions.

So, this section gives a more casual explanation.

o Attributes
o Attribute names

The attribute name must match /[a-z][a-z0-9_]*/.

o Attribute values

The attribute value must match /^["']?[^"';\s]["']?$/.

o Classes

Class names must match /^(?:edge|graph|group|node|[a-z][a-z0-9_]*)$/.

o Daisy-chains
o Edges

Edges can be daisy-chained by using a comma, ',', space, or attributes, '{...}', to separate them.

Hence both of these are valid: '<->,<=>{color:green}' and '<->{color:red}<=>{color:green}'.

o Nodes

Nodes can be daisy chained by juxtapostion, or by using the comma, ',', space, or attributes, '{...}', to separate them.

Hence all of these are valid: '[node.1][node.2]' and '[node.1],[node.2]' and '[node.1]{color:red}[node.2]'.

o Edges

Specifically, edges must currently match this regexp: /^<?(-|=|\.|~|- |= |\.-|\.\.-){1,}>?$/, which I've gleaned from the Graph::Easy docs at Edges.

o Events

These are part of the STT, but are not part of the Graph::Easy language.

Their names must match /[a-zA-Z_][a-zA-Z_0-9.]*/.

o Nodes

Node names must match /[a-zA-Z_][a-zA-Z_0-9. ]*/.

o States

These are part of the STT, but are not part of the Graph::Easy language.

Their names must match /[a-zA-Z_][a-zA-Z_0-9.]*/.

In the STT, this regexp applies to both the State name column ('C') and the Next state name column ('E').

o Subclass names

Subclass names must start with a class name, and must be separated from the class name with a fullstop, '.'.

The 4 special class names are listed above.

The subclass name itself must match /[a-z][a-z0-9_]*/.

Hence these are both valid: 'node.city' and 'edge.railway'.

FAQ

o What is the purpose of this set of modules?

It's the basis of a long-term project to formalize the way Graph::Easy processes its graph definitions, which in turn is meant to make on-going support for Graph::Easy much easier.

o What are Graph::Easy graphs?

You really should read the Graph::Easy docs.

In short, it means a text string containing a definition of a graph, using a cleverly designed language, that can be used to describe the sort of graph you wish to plot. Then, Graph::Easy does the plotting. Here is a sample.

o So what's a sample of a Graph::Easy graph definition?
        [node_1]{color:red;style:circle}=>{class:fancy;}[node_2]{color:green;}
o How are graphs stored in RAM (by Graph::Easy::Marpa::Parser)?

As an arrayref of hashrefs, where each hashref records information about one 'item' in the input stream.

Items are:

o Nodes

A node definition of '[N]' would produce a hashref of:

        {
        count => $n,
        name  => 'N',
        type  => 'node',
        value => '',
        }

A node can have a definition of '[]', which means it has no name. Such node are anonymous, and are called invisible because while they take up space in the output stream, they have no printable or visible characters in the output stream. See Graph::Easy for details.

Node names are case-sensitive, and must be unique (except for anonymous nodes).

Note: Items are numbered from 1 up, but some numbers are missing, since those values are used internally.

o Edges

An edge definition of '->' would produce a hashref of:

        {
        count => N,
        name  => '->',
        type  => 'edge',
        value => '',
        }
o Attributes

An attribute can belong to a node or an edge. An attribute definition of '{color: red;}' would produce a hashref of:

        {
        count => N,
        name  => 'color',
        type  => 'attr',
        value => 'red',
        }

An attribute definition of '{color: red;shape: circle;}' will produce 2 hashrefs, i.e. 2 elements in the arrayref:

        {
        count => N,
        name  => 'color',
        type  => 'attr',
        value => 'red',
        }

        {
        count => N,
        name  => 'shape',
        type  => 'attr',
        value => 'circle',
        }
o Classes

There are various special items in the arrayref of items, all placeholders.

They contain the default values of attributes you wish to assign to every item of a particular type.

They do not take up any place in the output stream, and you declare them in the input stream, and assign attributes to them, in the same way you assign attributes to any node or edge.

They are not the same as the anonymous nodes mentioned above.

Classes can have subclasses.

o The graph class

This class's name is 'graph', and it looks like:

        {
        count => N,
        name  => 'graph',
        type  => 'class',
        value => '',
        }

The attributes attached to this class apply to the graph as a whole.

o The group class

A group is a name given to a set of nodes.

This class's name is 'group', and it looks like:

        {
        count => N,
        name  => 'group',
        type  => 'class',
        value => '',
        }

The attributes attached to this class apply to each group.

o The node class

This node's name is 'node', and it looks like:

        {
        count => N,
        name  => 'node',
        type  => 'class',
        value => '',
        }

The attributes attached to this class apply to each node.

Attributes of a subclass of node, or belonging to a node, override the node class's attributes.

A subclass name for a node must start with 'node.', as in 'node.city'.

o The edge class

This class's name is 'edge', and it looks like:

        {
        count => N,
        name  => 'edge',
        type  => 'class',
        value => '',
        }

The attributes attached to this class apply to each edge.

Attributes of a subclass of edge, or belonging to an edge, override the edge class's attributes.

A subclass name for an edge must start with 'edge.', as in 'edge.road'.

o The daisy-chain item

This item indicates the graph definition contained 2 adjacent nodes, as in '[node.1],[node.2]', which means any following attributes must be assigned to all nodes in the daisy-chain.

It looks like:

        {
        count => N,
        name  => ',',
        type  => 'daisy_chain',
        value => '',
        }
o How are attributes assigned to nodes and edges?

Since the scan of the input stream is linear, any attribute detected belongs to the nearest preceeding node(s) or edge.

o Is there sample data I can examine?

See data/*.raw and the corresponding data/*.csv.

*.raw are input for the lexer, and *.csv are output from the lexer.

Note: Some files contain deliberate mistakes. See above for instructions on running scripts/lex.pl and scripts/lex.sh.

o What about the fact the Graph::Easy can read various other definition formats?

I have no plans to support such formats. Nevertheless, having written these modules, it should be fairly easy to derive classes which perform that sort of work.

o What's with the regexp for class names in data/default.stt.ods?

We can't use \w+ because 'graph{a:b}' matches that under Perl 5.12.2.

o How to I re-generate the web page of demos?

Patch scripts/generate.index.pl to use a permanent directory instead of calling File::Temp -> newdir(...). Then, run it.

TODO

Pass lexed graph definition from lexer to parser via RAM, not just via a disk file

This saves a bit of time, since then the lexer would not have to write a file, and the parser would not have to read one.

Add an anonymous node at the end of the input, if necessary

This counteracts a buglet in the renderer, in that it doesn't plot edges unless followed by a node. So, a graph with a trailing edge will not show that edge.

Proof-read all docs, specifically checking parameters for scripts and methods

Ensure docs for arrayref of items passed to the renderer is up-to-date.

Add a timeout on the lexer, not just on calling 'dot'

Use regexps from the STT to do more validation

Document difference in syntax compared with Graph::Easy

Not in detail, probably.

Implement classes and subclasses, and groups

Implement user-control of arrow shape

Implement user-control over the selection of rendering engine

This means the parser needs a graph type option, too.

Put data/default.stt.csv into the source code of the lexer

This saves a bit of time, since then the lexer would not have to read a separate file.

Allow logger objects to be provided in more places

Hence sub _init() would then read:

        $arg{logger} = defined($arg{logger}) ? $arg{logger} : Log::Handler -> new.

Then, the caller can use logger => '' to turn off logging.

Machine-Readable Change Log

The file CHANGES was converted into Changelog.ini by Module::Metadata::Changes.

Version Numbers

Version numbers < 1.00 represent development versions. From 1.00 up, they are production versions.

Thanks

Many thanks are due to the people who worked on Graph::Easy.

Jeffrey Kegler wrote Marpa, and has been helping me via private emails.

Support

Email the author, or log a bug on RT:

https://rt.cpan.org/Public/Dist/Display.html?Name=Graph::Easy::Marpa.

Author

Graph::Easy::Marpa was written by Ron Savage <ron@savage.net.au> in 2011.

Home page: http://savage.net.au/index.html.

Copyright

Australian copyright (c) 2011, Ron Savage.

        All Programs of mine are 'OSI Certified Open Source Software';
        you can redistribute them and/or modify them under the terms of
        The Artistic License, a copy of which is available at:
        http://www.opensource.org/licenses/index.html