NAME

MarpaX::Grammar::Parser - Converts a Marpa grammar into a tree using Tree::DAG_Node

Synopsis

        use MarpaX::Grammar::Parser;

        my(%option) =
        (
                marpa_bnf_file => 'data/metag.bnf',   # Input.
                raw_tree_file  => 'data/my.raw.tree', # Output.
                user_bnf_file  => 'data/my.bnf',      # Input.
        );

        my($parser) = MarpaX::Grammar::Parser -> new(%option);

        $parser -> run;

        print map{"$_\n"} @{$parser -> raw_tree -> tree2string({no_attributes => 1})};

See data/metag.bnf for the BNF file which ships with Marpa::R2 V 2.066000.

See data/*.bnf for input files and data/*.tree for output files.

For help, run

        shell> perl -Ilib scripts/g2p.pl -h

Description

Installation

Install MarpaX::Grammar::Parser as you would for any Perl module:

Run:

        cpanm MarpaX::Grammar::Parser

or run:

        sudo cpan MarpaX::Grammar::Parser

or unpack the distro, and then either:

        perl Build.PL
        ./Build
        ./Build test
        sudo ./Build install

or:

        perl Makefile.PL
        make (or dmake or nmake)
        make test
        make install

Constructor and Initialization

new() is called as my($parser) = MarpaX::Grammar::Parser -> new(k1 => v1, k2 => v2, ...).

It returns a new object of type MarpaX::Grammar::Parser.

Key-value pairs accepted in the parameter list (see corresponding methods for details [e.g. marpa_bnf_file([$string])]):

o logger aLog::HandlerObject

By default, an object of type Log::Handler is created which prints to STDOUT, but in this version nothing is actually printed.

See maxlevel and minlevel below.

Set logger to '' (the empty string) to stop a logger being created.

Default: undef.

o marpa_bnf_file aMarpaBNFFileName

Specify the name of Marpa's own BNF file. This file ships with Marpa::R2, in the meta/ directory. Its name is metag.bnf.

A copy, as of Marpa::R2 V 2.066000, ships with MarpaX::Grammar::Parser. See data/metag.bnf.

This option is mandatory.

Default: ''.

o maxlevel logOption1

This option affects Log::Handler objects.

See the Log::Handler::Levels docs.

Default: 'info'.

o minlevel logOption2

This option affects Log::Handler objects.

See the Log::Handler::Levels docs.

Default: 'error'.

No lower levels are used.

o no_attributes Boolean

Include (0) or exclude (1) attributes in the raw_tree_file output.

Default: 0.

o raw_tree_file aTextFileName

The name of the text file to write containing the grammar as a raw tree.

If '', the file is not written.

Default: ''.

o user_bnf_file aUserGrammarFileName

Specify the name of the file containing your Marpa::R2-style grammar.

See data/stringparser.bnf for a sample.

This option is mandatory.

Default: ''.
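
The mandatory options can be checked before new() is called. The following is a minimal sketch, not part of the module: check_options() and %default are hypothetical names, and the code assumes only what the list above states, namely that marpa_bnf_file and user_bnf_file must be non-empty and that the other keys take the defaults shown (logger is left out, since its default is undef).

```perl
use strict;
use warnings;

# Hypothetical helper, not part of MarpaX::Grammar::Parser.
# Defaults are copied from the option list above (logger omitted).
my(%default) =
(
    marpa_bnf_file => '',
    maxlevel       => 'info',
    minlevel       => 'error',
    no_attributes  => 0,
    raw_tree_file  => '',
    user_bnf_file  => '',
);

sub check_options
{
    # Caller-supplied pairs override the defaults.
    my(%option) = (%default, @_);

    # Both input files are documented as mandatory.
    for my $key (qw/marpa_bnf_file user_bnf_file/)
    {
        die "Option '$key' is mandatory\n" if ( ($option{$key} // '') eq '');
    }

    return %option;
}

my(%option) = check_options
(
    marpa_bnf_file => 'data/metag.bnf',
    no_attributes  => 1,
    user_bnf_file  => 'data/stringparser.bnf',
);

print "$_ => '$option{$_}'\n" for sort keys %option;
```

The resulting hash could then be passed to MarpaX::Grammar::Parser -> new(%option), as in the Synopsis.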

Methods

log($level, $s)

Calls $self -> logger -> log($level => $s) if ($self -> logger).
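
The guard described above can be sketched with a stand-in logger. Everything here is hypothetical: My::StubLogger and My::Demo are illustrative classes, not the module's code, and the stub only records what it is given rather than behaving like a real Log::Handler object.

```perl
use strict;
use warnings;

# Stand-in for a Log::Handler object. It only records what it is given.
package My::StubLogger;

sub new { return bless({lines => []}, shift) }

sub log
{
    my($self, $level, $s) = @_;

    push @{$$self{lines} }, "$level: $s";

    return;
}

# Hypothetical class mirroring the logger() accessor and the log() method.
package My::Demo;

sub new
{
    my($class, %arg) = @_;

    return bless({logger => $arg{logger} }, $class);
}

sub logger
{
    my($self, $logger) = @_;

    # Act as a setter only when a parameter was supplied.
    $$self{logger} = $logger if (@_ > 1);

    return $$self{logger};
}

sub log
{
    my($self, $level, $s) = @_;

    # Both undef and '' are false, so nothing is called without a logger.
    $self -> logger -> log($level => $s) if ($self -> logger);

    return;
}

package main;

my($stub) = My::StubLogger -> new;
my($demo) = My::Demo -> new(logger => $stub);

$demo -> log(info => 'Logged');
$demo -> logger(''); # Disable logging.
$demo -> log(info => 'Dropped');

print "$_\n" for @{$$stub{lines} };
```

Only the first message reaches the stub, because the second call finds an empty-string logger and skips the method call.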

logger([$logger_object])

Here, the [] indicate an optional parameter.

Get or set the logger object.

To disable logging, just set logger to the empty string.

Note: logger is a parameter to new().

marpa_bnf_file([$bnf_file_name])

Here, the [] indicate an optional parameter.

Get or set the name of the file to read Marpa's grammar's BNF from. The whole file is slurped in as a single string.

The parameter is mandatory.

This file ships with Marpa::R2, in the meta/ directory. Its name is metag.bnf.

A copy, as of Marpa::R2 V 2.066000, ships with MarpaX::Grammar::Parser.

See data/metag.bnf for a sample.

Note: marpa_bnf_file is a parameter to new().

maxlevel([$string])

Here, the [] indicate an optional parameter.

Get or set the value used by the logger object.

This option is only used if an object of type Log::Handler is created. See Log::Handler::Levels.

Note: maxlevel is a parameter to new().

minlevel([$string])

Here, the [] indicate an optional parameter.

Get or set the value used by the logger object.

This option is only used if an object of type Log::Handler is created. See Log::Handler::Levels.

Note: minlevel is a parameter to new().

no_attributes([$Boolean])

Here, the [] indicate an optional parameter.

Get or set the option which includes (0) or excludes (1) node attributes in the raw_tree_file output.

Note: no_attributes is a parameter to new().

raw_tree()

Returns the root node, of type Tree::DAG_Node, of the raw tree of items in the user's BNF.

By raw tree, I mean as derived directly from Marpa. Later, a cooked_tree() method will be provided, for a compressed version of the tree.

The raw tree is optionally written to the file name given by "raw_tree_file([$output_file_name])".

raw_tree_file([$output_file_name])

Here, the [] indicate an optional parameter.

Get or set the name of the file to which the raw tree form of the user's grammar will be written.

If no output file is supplied, nothing is written.

See data/stringparser.tree for the output of parsing data/stringparser.bnf.

This latter file is the grammar used in MarpaX::Demo::StringParser.

Note: raw_tree_file is a parameter to new().

user_bnf_file([$bnf_file_name])

Here, the [] indicate an optional parameter.

Get or set the name of the file to read the user's grammar's BNF from. The whole file is slurped in as a single string.

The parameter is mandatory.

See data/stringparser.bnf for a sample. It is the grammar used in MarpaX::Demo::StringParser.

Note: user_bnf_file is a parameter to new().

Files Shipped with this Module

o data/c.ast.bnf

This is part of MarpaX::Languages::C::AST, by Jean-Damien Durand. It's 1,565 lines long.

The output is data/c.ast.tree.

o data/c.ast.tree

This is the output from parsing data/c.ast.bnf. It's 56,723 lines long, which indicates the complexity of Jean-Damien's grammar for C.

The command to generate this file is:

        shell> scripts/g2p.sh c.ast

o data/json.1.bnf

It is part of MarpaX::Demo::JSONParser, written as a gist by Peter Stuifzand.

See https://gist.github.com/pstuifzand/4447349.

The output is data/json.1.tree.

o data/json.1.tree

This is the output from parsing data/json.1.bnf.

The command to generate this file is:

        shell> scripts/g2p.sh json.1

o data/json.2.bnf

It also is part of MarpaX::Demo::JSONParser, written by Jeffrey Kegler as a reply to the gist above from Peter.

The output is data/json.2.tree.

o data/json.2.tree

This is the output from parsing data/json.2.bnf.

The command to generate this file is:

        shell> scripts/g2p.sh json.2

o data/metag.bnf

This is a copy of Marpa::R2's BNF.

See "marpa_bnf_file([$bnf_file_name])" above.

o data/stringparser.bnf

This is a copy of MarpaX::Demo::StringParser's BNF.

The output is data/stringparser.tree.

See "user_bnf_file([$bnf_file_name])" above.

o data/stringparser.tree

This is the output from parsing data/stringparser.bnf.

The command to generate this file is:

        shell> scripts/g2p.sh stringparser

See also the next item.

o data/stringparser.treedumper

This is the output of running:

        shell> perl scripts/metag.pl data/metag.bnf data/stringparser.bnf > data/stringparser.treedumper

That script, metag.pl, is discussed just below, and in the "FAQ".

o scripts/g2p.pl

This is a neat way of using the module. For help, run:

        shell> perl -Ilib scripts/g2p.pl -h

Of course you are also encouraged to include this module directly in your own code.

o scripts/g2p.sh

This is a quick way for me to run g2p.pl.

o scripts/metag.pl

This is Jeffrey Kegler's code. See "Where did the basic code come from?" in the FAQ.

o scripts/pod2html.sh

This lets me quickly proof-read edits to the docs.

FAQ

What are the attributes and name of each node in the tree?

o Attributes
o level

This is the level in the tree of the 'current' node.

The root of the tree is level 0. All other nodes have the value of $level + 1, where $level (starting from 0) is determined by Data::TreeDumper.

o type

This indicates what type of node it is. Values:

o Grammar

'Grammar' means the node's name is an item from the user-specified grammar.

o Marpa

'Marpa' means that Marpa has assigned a class to the node, of the form:

        $class_name::$node_name

See data/stringparser.treedumper, which will make this much clearer.

$class_name is a constant provided by this module, and is 'MarpaX::Grammar::Parser::Dummy'.

o Name

This is either an item from the user-specified grammar (when the attribute type is 'Grammar') or a Marpa-internal token (when the attribute type is 'Marpa').
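
The two attributes can be illustrated on a small hash-based tree. This is a sketch under stated assumptions, not the module's code: it uses plain hashes rather than Tree::DAG_Node objects, the node names are invented, node_type() and set_attributes() are hypothetical helpers, and it assumes a node is of type 'Marpa' exactly when its name starts with the class name quoted above.

```perl
use strict;
use warnings;

# Constant quoted in the text above.
my($class_name) = 'MarpaX::Grammar::Parser::Dummy';

# Assumption: a name carrying the class prefix is a Marpa-internal token.
sub node_type
{
    my($name) = @_;

    return index($name, "${class_name}::") == 0 ? 'Marpa' : 'Grammar';
}

# Recursively set the level and type attributes on a hash-based node.
sub set_attributes
{
    my($node, $level) = @_;

    $$node{attributes} = {level => $level, type => node_type($$node{name})};

    set_attributes($_, $level + 1) for @{$$node{daughters} || []};

    return;
}

# Print one line per node, indented by level.
sub show
{
    my($node) = @_;
    my($attr) = $$node{attributes};

    print '    ' x $$attr{level}, "$$node{name} (level $$attr{level}, type $$attr{type})\n";

    show($_) for @{$$node{daughters} || []};

    return;
}

# Hypothetical node names, for illustration only.
my($root) =
{
    name      => 'statements',
    daughters =>
    [
        {name => "${class_name}::statement", daughters => [{name => 'graph_definition'}]},
    ],
};

set_attributes($root, 0); # The root is level 0.
show($root);
```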

Where did the basic code come from?

Jeffrey Kegler wrote it, and posted it on the Google Group dedicated to Marpa, on 2013-07-22, in the thread 'Low-hanging fruit'. I modified it slightly for a module context.

The original code is shipped as scripts/metag.pl.

As you can see, he uses a different way of reading the files, one which avoids loading a separate module. I've standardized on Perl6::Slurp, especially when I want utf8, and File::Slurp when I want to read a directory. Of course I try not to use both in the same module.
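
For readers unfamiliar with reading a whole file without a helper module, one common core-only idiom is sketched below. It is not claimed to be the code in scripts/metag.pl; slurp() is a hypothetical name, and the 2-line grammar fragment is invented for the demonstration.

```perl
use strict;
use warnings;

use File::Temp; # Core module, used only to create a file for the demonstration.

# Read a whole file into one string using only core Perl.
sub slurp
{
    my($file_name) = @_;

    open(my $fh, '<', $file_name) || die "Can't open $file_name: $!\n";

    # With the input record separator undefined, one read returns the whole file.
    local $/;

    my $text = <$fh>;

    close $fh;

    return $text;
}

# Write an invented grammar fragment to a temporary file, then read it back.
my($source) = ":default ::= action => [values]\n:start ::= statement\n";
my($temp)   = File::Temp -> new(SUFFIX => '.bnf');

print {$temp} $source;
close $temp;

my($bnf) = slurp($temp -> filename);

print length($bnf), " characters read\n";
```

The local keyword restores the record separator when slurp() returns, so other reads in the program are unaffected.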

Why did you use Data::TreeDumper?

Of the modules I tested, it offered the output which was most easily parsed. The others were Data::Dumper, Data::TreeDraw and Data::Printer.

Why are some options/methods called raw_*?

See "ToDo" below for details.

Where is Marpa's Homepage?

http://jeffreykegler.github.io/Ocean-of-Awareness-blog/.

Are there any articles discussing Marpa?

Yes, many by its author, and several others. See Marpa's homepage, just above, and:

The Marpa Guide (in progress, by Peter Stuifzand and Ron Savage).

Parsing a here doc, by Peter Stuifzand.

An update of parsing here docs, by Peter Stuifzand.

Conditional preservation of whitespace, by Ron Savage.

See Also

MarpaX::Demo::JSONParser.

MarpaX::Demo::StringParser.

MarpaX::Languages::C::AST.

Data::TreeDumper.

Log::Handler.

ToDo

o Compress the tree
o Horizontal compression

At the moment, the first 2 children of each 'class' type node are the offset and length within the input stream where the parser found each token. I want to move those into the attributes of the 3rd node, and hence remove those 2 nodes at each level of the tree.

See data/stringparser.tree.

o Vertical compression

The tree contains many nodes which are artifacts of Marpa's processing method. I want to remove any nodes which do not refer directly to items in the user's grammar.

Together, this will mean the remaining nodes can be used without further modification as input to my other module MarpaX::Grammar::GraphViz2. The latter is on hold until I can effect these compressions, so don't be surprised if that link fails.

When this work is done, there will be 2 new attributes in this module, cooked_tree() to return the root of the compressed tree, and cooked_tree_file(), which will name the file to use to save the new tree to disk.
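
The two compressions can be sketched on a hash-based tree. This is only an illustration of the idea, not the planned implementation: it uses plain hashes rather than Tree::DAG_Node objects, the helper names and node names are hypothetical, a 'class' type node is assumed to be one whose type attribute is 'Marpa', and the type given to the offset and length nodes is a guess.

```perl
use strict;
use warnings;

# Hypothetical constructor for a hash-based node.
sub node
{
    my($name, $type, @daughters) = @_;

    return {name => $name, attributes => {type => $type}, daughters => [@daughters]};
}

# Horizontal: fold the offset and length daughters into the 3rd daughter.
sub compress_horizontally
{
    my($node)      = @_;
    my($daughters) = $$node{daughters};

    if ( ($$node{attributes}{type} eq 'Marpa') && (@$daughters >= 3) )
    {
        my($offset, $length) = splice(@$daughters, 0, 2);

        # After the splice, the old 3rd daughter is the first.
        $$daughters[0]{attributes}{offset} = $$offset{name};
        $$daughters[0]{attributes}{length} = $$length{name};
    }

    compress_horizontally($_) for @$daughters;

    return;
}

# Vertical: drop 'Marpa' nodes below the root, promoting their daughters.
sub compress_vertically
{
    my($node) = @_;

    my(@kept);

    for my $daughter (@{$$node{daughters} })
    {
        compress_vertically($daughter);

        push @kept, $$daughter{attributes}{type} eq 'Marpa' ? @{$$daughter{daughters} } : $daughter;
    }

    $$node{daughters} = [@kept];

    return;
}

# Invented tree: a Marpa node whose daughters are offset, length and an item.
my($root) = node('statements', 'Grammar',
    node('MarpaX::Grammar::Parser::Dummy::statement', 'Marpa',
        node('0', 'Grammar'),   # Offset within the input stream.
        node('21', 'Grammar'),  # Length of the token.
        node('graph_definition', 'Grammar') ) );

compress_horizontally($root);
compress_vertically($root);

my($item) = $$root{daughters}[0];

print "$$item{name}: offset $$item{attributes}{offset}, length $$item{attributes}{length}\n";
```

Running the horizontal pass first matters in this sketch, since the vertical pass removes the 'Marpa' nodes that the horizontal pass looks for.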

Machine-Readable Change Log

The file Changes was converted into Changelog.ini by Module::Metadata::Changes.

Version Numbers

Version numbers < 1.00 represent development versions. From 1.00 up, they are production versions.

Support

Email the author, or log a bug on RT:

https://rt.cpan.org/Public/Dist/Display.html?Name=MarpaX::Grammar::Parser.

Author

MarpaX::Grammar::Parser was written by Ron Savage <ron@savage.net.au> in 2013.

Home page: http://savage.net.au/.

Copyright

Australian copyright (c) 2013, Ron Savage.

        All Programs of mine are 'OSI Certified Open Source Software';
        you can redistribute them and/or modify them under the terms of
        The Artistic License 2.0, a copy of which is available at:
        http://www.opensource.org/licenses/index.html