NAME

Graph::Easy::Marpa - A Marpa- and Set::FA::Element-based parser for Graph::Easy

Synopsis

Sample Code

        #!/usr/bin/env perl
        
        use strict;
        use warnings;
        
        use Graph::Easy::Marpa;
        
        use Getopt::Long;
        
        use Pod::Usage;
        
        # -----------------------------------------------
        
        my($option_parser) = Getopt::Long::Parser -> new();
        
        my(%option);
        
        if ($option_parser -> getoptions
        (
         \%option,
         'cooked_file=s',
         'description=s',
         'format=s',
         'help',
         'input_file=s',
         'logger=s',
         'maxlevel=s',
         'minlevel=s',
         'output_file=s',
         'parsed_tokens_file=s',
         'stt_file=s',
         'type=s',
        ) )
        {
                pod2usage(1) if ($option{'help'});
        
                # Return 0 for success and 1 for failure.
        
                exit Graph::Easy::Marpa -> new(%option) -> run;
        }
        else
        {
                pod2usage(2);
        }

This is shipped as scripts/gem.pl, although the shipped version has built-in help.

See also scripts/lex.pl and scripts/parse.pl.

Sample output

Unpack the distro and copy html/*.html and html/*.svg to your web server's doc root directory.

Then, point your browser at 127.0.0.1/index.html.

Or, hit http://savage.net.au/Perl-modules/html/graph.easy.marpa/index.html.

Modules

o Graph::Easy::Marpa

The current module, which documents the set of modules.

It uses Graph::Easy::Marpa::Lexer, Graph::Easy::Marpa::Parser and Graph::Easy::Marpa::Renderer::GraphViz2 to render a Graph::Easy-syntax file into a (by default) *.svg file.

See scripts/gem.pl and scripts/gem.sh.

o Graph::Easy::Marpa::Lexer

See Graph::Easy::Marpa::Lexer.

Processes a raw Graph::Easy graph definition and outputs a cooked representation of that graph in a language which can be read by the parser.

See scripts/lex.pl and scripts/lex.sh.

o Graph::Easy::Marpa::Lexer::DFA

See Graph::Easy::Marpa::Lexer::DFA.

Wraps Set::FA::Element, which is what actually lexes the input Graph::Easy-syntax graph definition.

o Graph::Easy::Marpa::Parser

See Graph::Easy::Marpa::Parser.

Accepts a graph definition in the cooked language and builds a data structure representing the graph.

See scripts/parse.pl and scripts/parse.sh.

o Graph::Easy::Marpa::Utils

Code to help with testing.

Description

Graph::Easy::Marpa provides a Marpa-based parser for Graph::Easy-style graph definitions.

See "Data Files and Scripts" for details.

Distributions

This module is available as a Unix-style distro (*.tgz).

See http://savage.net.au/Perl-modules/html/installing-a-module.html for help on unpacking and installing distros.

Installation

Install Graph::Easy::Marpa as you would for any Perl module:

Run:

        cpanm Graph::Easy::Marpa

or run:

        sudo cpan Graph::Easy::Marpa

or unpack the distro, and then either:

        perl Build.PL
        ./Build
        ./Build test
        sudo ./Build install

or:

        perl Makefile.PL
        make (or dmake or nmake)
        make test
        make install

Constructor and Initialization

new() is called as my($parser) = Graph::Easy::Marpa -> new(k1 => v1, k2 => v2, ...).

It returns a new object of type Graph::Easy::Marpa.

Key-value pairs accepted in the parameter list (see corresponding methods for details [e.g. maxlevel()]):

o cooked_file => $csv_file_name

This is the name of the file to write containing the tokens (items) output from Graph::Easy::Marpa::Lexer.

This file can be input to Graph::Easy::Marpa::Parser.

See also the 'parsed_tokens_file' key, below.

o description => $graph_description_string

Specify a string for the graph definition.

You are strongly encouraged to surround this string with '...' to protect it from your shell.

See also the 'input_file' key to read the graph from a file.

The 'description' key takes precedence over the 'input_file' key.

o dot_input_file => $file_name

Specify the name of a file that the rendering engine can write to, which will contain the input to dot (or whatever). This is good for debugging.

Default: ''.

If '', the file will not be created.

o format => $format_name

This is the format of the output file, to be created by the renderer.

Default is 'svg'.

o input_file => $graph_file_name

Read the graph definition from this file.

See also the 'description' key to read the graph from the command line.

The whole file is slurped in as 1 graph.

Leading lines of the file which match /^\s*#/ are discarded as comments.

The 'description' key takes precedence over the 'input_file' key.

o logger => $logger_object

Specify a logger object.

To disable logging, just set logger to the empty string.

The default value is an object of type Log::Handler which outputs to the screen.

This logger is passed to Graph::Easy::Marpa::Lexer, Graph::Easy::Marpa::Lexer::DFA, Graph::Easy::Marpa::Parser and Graph::Easy::Marpa::Renderer::GraphViz2.

o maxlevel => $level

This option is only used if Graph::Easy::Marpa::Lexer or Graph::Easy::Marpa::Parser create an object of type Log::Handler. See Log::Handler::Levels.

The default 'maxlevel' is 'info'. A typical value is 'debug'.

o minlevel => $level

This option is only used if Graph::Easy::Marpa::Lexer or Graph::Easy::Marpa::Parser create an object of type Log::Handler. See Log::Handler::Levels.

The default 'minlevel' is 'error'.

No lower levels are used.

o output_file => $output_file_name

If an output file name is supplied, and a rendering object is also supplied, then this call is made:

        $self -> renderer -> run(format => $self -> format, items => [$self -> items -> print], output_file => $file_name);

This is how the plotted graph is actually created.

o parsed_tokens_file => $token_file_name

This is the name of the file to write containing the tokens (items) output from Graph::Easy::Marpa::Parser.
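
The interaction between the 'description' and 'input_file' keys described above can be sketched in plain Perl. This is only an illustration of the documented rules (description wins, and leading comment lines are discarded). The helper names graph_text() and strip_leading_comments() are hypothetical and are not part of Graph::Easy::Marpa, and the way the remaining lines are joined is an assumption.

```perl
use strict;
use warnings;

# Hypothetical helper, not part of Graph::Easy::Marpa.
# Drop the leading run of lines matching /^\s*#/ from a list of lines.

sub strip_leading_comments
{
	my(@line) = @_;

	shift @line while (@line && $line[0] =~ /^\s*#/);

	return @line;
}

# Hypothetical helper: choose the graph definition from a hash of options.
# 'description' takes precedence over 'input_file'.

sub graph_text
{
	my(%option) = @_;

	return $option{description} if (defined $option{description} && length $option{description});

	die "Specify either 'description' or 'input_file'\n" if (! defined $option{input_file});

	open(my $fh, '<', $option{input_file}) || die "Can't open $option{input_file}: $!\n";
	my(@line) = <$fh>;
	close $fh;

	# Assumption: the remaining lines are simply concatenated into 1 graph.

	return join('', strip_leading_comments(@line) );
}
```

Calling graph_text(description => '[a]', input_file => 'x.raw') returns '[a]' without opening x.raw, which mirrors the precedence rule.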

Data Files and Scripts

Overview of the Data Flow

The lexer and the parser work like this:

o Lexer input

o The State Transition Table (STT) file

The STT is stored outside the code (unlike the grammar for the cooked graph definition).

The current design ships the STT in 2 files, data/default.stt.ods and data/default.stt.csv.

*.ods is an OpenOffice Calc spreadsheet, and *.csv is a Comma-Separated Values file.

This allows any user to change the STT as an experiment.

I work with the *.ods file, and export it to the *.csv file.

The program scripts/stt2html.pl converts the *.csv file to html for ease of display.

See new(stt_file => $stt_file_name, type => $stt_file_type) in "Constructor_and_Initialization" in Graph::Easy::Marpa::Lexer for details.

o The raw Graph::Easy Graph Definition

A definition looks like '[node.1]{a:b;c:d}<->{e:f;}<=>{g:h}[node.2]{i:j}===[node.3]{k:l}'.

Node names are: node.1, node.2 and node.3.

Edge names are: <->, <=> and ===.

And yes, unlike the original Graph::Easy syntax, you can use a series of edges between 2 nodes, as with <-> and <=> above.

Nodes and edges can have attributes, very much like CSS. The attributes in this sample are meaningless, and are just to demonstrate the syntax.

The lexer can accept a graph definition in 2 ways: from a file, via new(file => $graph_file_name), or from a string, via new(graph => $graph_string).

See "Constructor_and_Initialization" in Graph::Easy::Marpa::Lexer for details.
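
To make the shape of a raw definition concrete, here is a toy tokenizer for the sample string above. It is a regexp-based sketch only. The real lexer is driven by the STT and Set::FA::Element, and the function name toy_tokens() is hypothetical.

```perl
use strict;
use warnings;

# Hypothetical toy tokenizer, not the STT-driven lexer.
# Returns a list of [type => text] pairs, where type is
# 'node', 'attributes' or 'edge'.

sub toy_tokens
{
	my($text) = @_;

	my(@token);

	# [...] is a node, {...} is a set of attributes, and anything else,
	# up to the next bracket, brace or whitespace, is treated as an edge.

	while ($text =~ /\G\s*(?:\[([^\]]*)\]|\{([^}]*)\}|([^\[\]{}\s]+))/gc)
	{
		if (defined $1)
		{
			push @token, [node => $1];
		}
		elsif (defined $2)
		{
			push @token, [attributes => $2];
		}
		else
		{
			push @token, [edge => $3];
		}
	}

	return @token;
}

my(@token) = toy_tokens('[node.1]{a:b;c:d}<->{e:f;}<=>{g:h}[node.2]{i:j}===[node.3]{k:l}');

print join(', ', map{$$_[1]} grep{$$_[0] eq 'node'} @token), "\n";
print join(', ', map{$$_[1]} grep{$$_[0] eq 'edge'} @token), "\n";
```

The 2 print statements list the node names (node.1, node.2, node.3) and the edge names (<->, <=>, ===) given above.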

o Lexer processing

Call the lexer as my($result) = Graph::Easy::Marpa::Lexer -> new(%options) -> run.

run() returns 0 for success and 1 for failure.

run() dies with an error message upon error.

o Lexer output

The lexer writes a cooked graph definition to a file, using an intermediary language I invented just for this purpose.

The output file is in *.csv format. This file becomes input for the parser.

Of course, to exercise the parser, such files can be created manually.

See new(cooked => $csv_file_name) in "Constructor_and_Initialization" in Graph::Easy::Marpa::Lexer for details.

o Parser input

o The Grammar for the Cooked Graph Definition

The grammar is stored inside the code (unlike the STT).

This grammar is recognized by Marpa, which is the basis of the parser. See "grammar()" in Graph::Easy::Marpa::Parser.

o The Cooked Graph Definition

The *.csv file output by the lexer, or created manually, is the other input to the parser.

o Parser processing

Call the parser as my($result) = Graph::Easy::Marpa::Parser -> new(%options) -> run.

run() returns 0 for success and 1 for failure.

run() dies with an error message upon error.

o Parser output

After the parser runs successfully, the parser object holds a Set::Array object of tokens representing the graph.

An arrayref of items can be retrieved with the items() method in both the lexer and the parser.

The format of this array is documented below, in the "FAQ".

Later, a formatter will be written to position the tokens in space, for passing to a plotter such as 'dot'.

Data and Script Interaction

Sample input files for the lexer are in data/*.raw. Sample output files from the lexer, which are also input files for the parser, are in data/*.cooked.

o scripts/lex.pl and scripts/lex.sh

These use Graph::Easy::Marpa::Lexer.

They run the lexer on 1 *.raw input file, and produce an arrayref of items, and - optionally - 1 *.cooked output file.

Run scripts/lex.pl -h for samples of how to drive it.

Try:

        perl -Ilib scripts/lex.pl -stt data/default.stt.csv -t csv -i data/node.04.raw -c data/node.04.cooked
        cat data/node.04.raw
        cat data/node.04.cooked

        perl -Ilib scripts/lex.pl -stt data/default.stt.csv -t csv -i data/node.05.raw -c data/node.05.cooked
        cat data/node.05.raw
        cat data/node.05.cooked

You can use scripts/lex.sh to simplify this process:

        scripts/lex.sh data/node.05.raw data/node.05.cooked
        scripts/lex.sh data/graph.12.raw data/graph.12.cooked

o scripts/parse.pl and scripts/parse.sh

These use Graph::Easy::Marpa::Parser.

They run the parser on 1 *.cooked input file, and produce an arrayref of items.

Run scripts/parse.pl -h for samples of how to drive it.

Try:

        cat data/node.05.cooked
        perl -Ilib scripts/parse.pl -i data/node.05.cooked

You can use scripts/parse.sh to simplify this process:

        scripts/parse.sh data/node.05.cooked
        scripts/parse.sh data/graph.12.cooked

o scripts/gem.pl and scripts/gem.sh

These use Graph::Easy::Marpa to combine calls to Graph::Easy::Marpa::Lexer and Graph::Easy::Marpa::Parser.

Run scripts/gem.pl -h for samples of how to drive it.

Try, using a shell variable for brevity:

        X=graph.13
        perl -Ilib scripts/gem.pl -i data/$X.raw -c $X.cooked -o $X.svg -p $X.items
        cat $X.cooked
        cat $X.items
        cat $X.svg

You can use scripts/gem.sh to simplify this process:

        X=graph.13
        scripts/gem.sh $X
        cat $X.cooked
        cat $X.items
        cat $X.svg

The Subset of Graph::Easy Graph Definitions Accepted by the Parser

Obviously, the STT in data/default.stt.ods and data/default.stt.csv defines precisely the currently acceptable syntax for graph definitions.

So, this section gives a more casual explanation.

o Attributes

o Attribute names

The attribute name must match /^[a-z]+$/.

o Attribute values

The attribute value is any string up to the next ';' or '}'.

Attribute values may be quoted with "..." or '...'. These quotes are stripped.
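
The attribute rules above can be sketched as a small parser. This is an illustration under stated assumptions, not the lexer's own code: the function name parse_attributes() is hypothetical, and for simplicity it does not handle a ';' inside a quoted value.

```perl
use strict;
use warnings;

# Hypothetical helper: turn '{name: value; ...}' into a hashref.

sub parse_attributes
{
	my($text) = @_;

	# Remove the enclosing braces, if any.

	$text =~ s/^\s*\{//;
	$text =~ s/\}\s*$//;

	my(%attribute);

	for my $pair (split(/;/, $text) )
	{
		next if ($pair !~ /\S/); # Tolerate a trailing ';'.

		my($name, $value) = split(/:/, $pair, 2);

		die "Missing ':' in '$pair'\n" if (! defined $value);

		# Trim leading and trailing whitespace from both parts.

		s/^\s+|\s+$//g for ($name, $value);

		die "Bad attribute name '$name'\n" if ($name !~ /^[a-z]+$/);

		# Strip 1 pair of matching quotes, "..." or '...'.

		$value =~ s/^(["'])(.*)\1$/$2/;

		$attribute{$name} = $value;
	}

	return \%attribute;
}
```

With this sketch, parse_attributes('{shape: circle; color: "blue"}') yields a hashref with shape => 'circle' and color => 'blue', while a name such as 'Color' dies because it does not match /^[a-z]+$/.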

o Classes

Class + subclass names must match /^(edge|global|graph|group|node)(\.[a-z]+)?$/.

The name before the '.' is the class name.

'global' is used to specify whether you want a directed or undirected graph. The default is directed.

        global {directed: 1} [node.1] -> [node.2]

'graph' is used to specify the direction of the graph as a whole, and must be one of: LR or RL or TB or BT. The default is TB.

        graph {rankdir: LR} [node.1] -> [node.2]

The name after the '.' is the subclass name. And if '.' is present, the subclass name must be present. This means things like 'edge.' etc are syntax errors.

You use the subclass name in the attributes of an edge, a group or a node, whereas 'global' and 'graph' appear only once, at the start of the input stream.

        node {shape: square} node.forest {color: green}
        [node.1] -> [node.2] {class: forest} -> [node.3] {shape: circle; color: blue}

Here, node.1 gets the default shape, square, and node.2 gets both shape square and color green. node.3 gets shape circle and color blue.

As always, specific attributes override class attributes.
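
The class rules above can be illustrated with a short sketch which validates class names and then resolves the attributes of the 3 nodes in the sample. The function names split_class_name() and effective_attributes() are hypothetical, and the merge order (class, then subclass, then the item's own attributes) is an assumption based on the statement that specific attributes override class attributes.

```perl
use strict;
use warnings;

my($class_regexp) = qr/^(edge|global|graph|group|node)(\.[a-z]+)?$/;

# Hypothetical helper: split 'node.forest' into ('node', 'forest').
# Returns an empty list for invalid names such as 'edge.'.

sub split_class_name
{
	my($name) = @_;

	if ($name =~ $class_regexp)
	{
		my($class, $subclass) = ($1, $2);

		$subclass =~ s/^\.// if (defined $subclass);

		return ($class, $subclass);
	}

	return ();
}

# Hypothetical helper: merge class, subclass and own attributes,
# with later sources overriding earlier ones.

sub effective_attributes
{
	my($class_attribute, $class, $own) = @_;
	my(%result) = %{$class_attribute -> {$class} || {} };

	if (defined $own -> {class})
	{
		my($key) = join('.', $class, $own -> {class});
		%result  = (%result, %{$class_attribute -> {$key} || {} });
	}

	%result = (%result, %$own);

	delete $result{class}; # 'class' selects a subclass; it is not rendered.

	return \%result;
}

# The class definitions from the sample above.

my(%class_attribute) =
(
	'node'        => {shape => 'square'},
	'node.forest' => {color => 'green'},
);

my($node_2) = effective_attributes(\%class_attribute, 'node', {class => 'forest'});

print join(', ', map{"$_: $$node_2{$_}"} sort keys %$node_2), "\n";
```

For node.2 this prints 'color: green, shape: square', matching the description above.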

o Daisy-chains

o Edges

Edges must match /^(->|--)$/.

Edges can be daisy-chained by using a comma, ',', newline, space, or attributes, '{...}', to separate them.

Hence both of these are valid: '->,->{color:green}' and '->{color:red}->{color:green}'.

Edges can have attributes such as arrowhead, arrowtail, etc. See the Graphviz documentation.

The edge is actually rendered, via the default renderer GraphViz2, by Graphviz.

Note: The syntax for edges is just a visual clue for the user. Whether the graph is directed or undirected depends on the value of the 'directed' attribute present (explicitly or implicitly) in the input stream.

The default is {directed: 1}. See data/class.global.01.raw for a case where we use {directed: 0} attached to class 'global'.

o Groups

Groups can be daisy-chained by juxtaposition, or by using a newline or space to separate them.

o Nodes

Nodes can be daisy-chained by juxtaposition, or by using the comma, ',', newline, space, or attributes, '{...}', to separate them.

Hence all of these are valid: '[node.1][node.2]' and '[node.1],[node.2]' and '[node.1]{color:red}[node.2]'.
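
The note under Edges, that the 'directed' attribute rather than the edge syntax decides the nature of the graph, can be pictured in terms of Graphviz's DOT language, where a directed graph is declared with 'digraph' and joins nodes with '->', and an undirected graph is declared with 'graph' and joins nodes with '--'. The sketch below emits DOT text for a daisy-chain of nodes. The function name chain_to_dot() is hypothetical, and how the GraphViz2 renderer really produces its input to dot is not shown here.

```perl
use strict;
use warnings;

# Hypothetical helper: emit DOT text for a daisy-chain of nodes.
# $directed plays the role of the 'directed' attribute of class 'global'.

sub chain_to_dot
{
	my($directed, @node) = @_;
	my($graph_type)      = $directed ? 'digraph' : 'graph';
	my($edge_operator)   = $directed ? '->' : '--';

	# Quote each node name, since names may contain '.' and spaces.

	my(@quoted) = map{sprintf('"%s"', $_)} @node;

	return "$graph_type {\n\t" . join(" $edge_operator ", @quoted) . ";\n}\n";
}

print chain_to_dot(1, 'node.1', 'node.2', 'node.3');
print chain_to_dot(0, 'node.1', 'node.2', 'node.3');
```

The first call produces a 'digraph' whose nodes are joined with '->', and the second a 'graph' whose nodes are joined with '--', whatever edge syntax appeared in the raw definition.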

o Events

These are part of the STT, but are not part of the Graph::Easy language.

Their names must match /^[a-zA-Z_][a-zA-Z_0-9.]*$/.

o Groups

Group names must match /^[a-zA-Z_.][a-zA-Z_0-9. ]*$/.

o Nodes

Node names must match /^[a-zA-Z_0-9. ]+$/.

Since leading and trailing spaces are stripped, a single space can be used to represent the anonymous node.

o States

These are part of the STT, but are not part of the Graph::Easy language.

Their names must match /^[a-zA-Z_][a-zA-Z_0-9]*$/.

In the STT, this regexp applies to both the State name column ('C' in the spreadsheet data/default.stt.ods) and the Next state name column ('E').
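
The naming rules for edges, events, groups, nodes and states listed above can be collected into a single lookup table, which is handy when experimenting with changes to the STT. This is a sketch only. The regexps are copied from this section, but the hash %name_regexp and the function is_valid_name() are hypothetical.

```perl
use strict;
use warnings;

# The regexps are taken from the text above; the table itself is hypothetical.

my(%name_regexp) =
(
	edge  => qr/^(->|--)$/,
	event => qr/^[a-zA-Z_][a-zA-Z_0-9.]*$/,
	group => qr/^[a-zA-Z_.][a-zA-Z_0-9. ]*$/,
	node  => qr/^[a-zA-Z_0-9. ]+$/,
	state => qr/^[a-zA-Z_][a-zA-Z_0-9]*$/,
);

# Return 1 if $name is acceptable for the given type, else 0.

sub is_valid_name
{
	my($type, $name) = @_;

	die "Unknown type '$type'\n" if (! $name_regexp{$type});

	return $name =~ $name_regexp{$type} ? 1 : 0;
}

for my $test (['node', 'node.1'], ['node', ' '], ['event', 'open.bracket'], ['state', 'open.bracket'])
{
	printf "%-5s '%s' => %d\n", @$test, is_valid_name(@$test);
}
```

Note that 'open.bracket' is just a made-up name: it is acceptable as an event name but not as a state name, since only events may contain '.'. The single space is accepted as a node name, in line with the remark about the anonymous node.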

Methods