Marpa::Doc::Plumbing - The Plumbing Interface
This document describes Marpa's plumbing interface. The plumbing is the low-level interface used by all the porcelain interfaces. It can also be used directly. It consists of a short list of named arguments to the Marpa::Grammar::new(), Marpa::Grammar::set(), and Marpa::Recognizer::new() methods.
The start argument may be used in combination with a porcelain interface, subject to the symbol name conversion requirements described below. Other than that, plumbing and porcelain interfaces cannot be used to build the same grammar. Marpa throws an exception if the user attempts to use any of the plumbing's other named arguments with a porcelain interface.
Each interface has its own rules for symbol names. The plumbing's conventions are designed to allow flexibility for the porcelain. Any valid Perl string not ending in a right square bracket is an acceptable plumbing symbol name. Plumbing symbol names which end in right square brackets are reserved for Marpa internal use.
Unlike MDL symbols, plumbing symbols are considered identical only if their names match exactly. Unless stated otherwise, any reference to a symbol name in this document means a plumbing symbol name.
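The naming rules above can be sketched in a few lines of Perl. The helper is_user_symbol_name is hypothetical and not part of Marpa; it only restates the rule that a name ending in a right square bracket is reserved, and the sample names are invented.

```perl
use strict;
use warnings;

# Hypothetical helper, not part of Marpa: true if $name may be used as a
# user-supplied plumbing symbol name, that is, it does not end in "]".
sub is_user_symbol_name {
    my ($name) = @_;
    return 0 if not defined $name;
    return $name =~ m/ \] \z /xms ? 0 : 1;
}

# 'expression' and 'Expression' are both usable, but because plumbing
# names must match exactly they would name two different symbols.
for my $name ( 'expression', 'Expression', 'term[', 'expression[prime]' ) {
    printf "%-20s %s\n", $name,
        is_user_symbol_name($name) ? 'usable' : 'reserved';
}
```

Only the final character matters in this sketch, so a name such as 'term[' counts as usable while 'expression[prime]' counts as reserved.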
my $a = $grammar->get_symbol('a');
Given a symbol's plumbing name, returns the symbol's cookie. Returns undef if no symbol with that name exists. If you are using MDL to define your grammar, use Marpa::MDL::get_symbol instead.
Symbol cookies are used primarily in calls to the Marpa::Recognizer::earleme method. To get the cookie for a symbol using its porcelain name, see the documentation for the individual porcelain interface.
rules
The rules named argument is available with both the Marpa::Grammar::new and Marpa::Grammar::set methods. The rules named argument may be specified multiple times, adding new rules to the grammar each time. New rules may be added until the grammar is precomputed.
The value of the rules named argument must be a reference to an array, and each element of the array must be a reference to a description of a rule. Rule descriptions can be either arrays (the short form) or hashes (the long form).
The short form description of a rule is an array with up to four elements, in this order: lhs, rhs, action and priority. The last two are optional.
The lhs element must be the name of the left hand side symbol. The rhs element must be a reference to an array of right hand side symbol names. In the case of an empty rule, rhs must be a reference to a zero length array.
The action element, if present, must be a string describing the rule's action in the current Marpa semantics. Right now, the only available semantics is Perl 5. If the action for a rule is not explicitly set, it will be the value of Marpa's default_action option.
The priority element, if present, must be an integer, which may be negative. It sets the priority of the rule. If undefined, priority defaults to zero.
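As an illustration of the short form, the following sketch builds a value suitable for the rules named argument. The symbol names and action strings are invented. The commented-out call assumes that the named arguments are passed as a hash reference, which this document does not itself state.

```perl
use strict;
use warnings;

# Short-form rule descriptions: [ lhs, rhs, action, priority ].
# Symbol names and action strings are invented for illustration.
my $rules = [

    # lhs and rhs only: the action comes from default_action and the
    # priority defaults to zero
    [ 'expression', ['term'] ],

    # with an action, given as a string of Perl 5 code
    [ 'expression', [ 'expression', 'plus', 'term' ], 'return "sum"' ],

    # with an action and a negative priority
    [ 'term', ['number'], 'return "number"', -1 ],

    # an empty rule: rhs is a reference to a zero length array
    [ 'optional_sign', [] ],
];

# Assumed call shape (not run here):
#   my $grammar = Marpa::Grammar->new( { rules => $rules } );
```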
The long form description of a rule is a hash of rule options, with the option names as the hash keys, and the option values as the hash values. The available rule options are:
lhs
rhs
action
priority
The values of the lhs, rhs, action, and priority rule options are as described above for the corresponding elements of the short form.
min
min must be undefined, 0 or 1. If min is 0 or 1, the rule is a sequence production. If min is undefined, the rule is an ordinary BNF production.
Only one symbol is allowed on the right hand side of a sequence production, and the right hand side symbol may not be a nullable symbol. The input will be required to match the rhs symbol at least min times and will be allowed to match an unlimited number of times. For an introduction to sequence productions, see the MDL document.
separator
Any sequence production may have a separator defined. The value must be a symbol name. Marpa allows trailing separators, Perl style. The separator must not be a nullable symbol.
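The long form might look like the following sketch, which shows one BNF production and two sequence productions. All symbol names and the action string are invented for illustration.

```perl
use strict;
use warnings;

# Long-form rule descriptions: hashes keyed by rule option name.
# Symbol names and the action string are invented for illustration.
my $rules = [

    # an ordinary BNF production: min is left undefined
    {   lhs      => 'expression',
        rhs      => [ 'expression', 'plus', 'term' ],
        action   => 'return "sum"',
        priority => 1,
    },

    # a sequence of one or more items, separated by commas
    {   lhs       => 'item_list',
        rhs       => ['item'],
        min       => 1,
        separator => 'comma',
    },

    # a sequence of zero or more statements, with no separator
    {   lhs => 'statements',
        rhs => ['statement'],
        min => 0,
    },
];
```

Both sequence productions have exactly one right hand side symbol, as the text above requires.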
Marpa throws an exception if a duplicate rule is added. For BNF productions, a rule is considered a duplicate if it has the same left hand side symbol, and the same symbols in the same order on the right hand side. For sequences, a rule is considered a duplicate if it has the same left hand side symbol, the same right hand side symbol, and the same separator.
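One way to picture the duplicate test is as a comparison of keys built from the fields named above. The helper duplicate_key below is hypothetical and is not Marpa's implementation. It assumes long-form rule descriptions and symbol names that contain no NUL character, and it treats a BNF production and a sequence production as distinct, a case the text above does not address.

```perl
use strict;
use warnings;

# Hypothetical helper, not Marpa's implementation: two long-form rule
# descriptions get the same key exactly when the text above would call
# them duplicates. Assumes symbol names contain no NUL character.
sub duplicate_key {
    my ($rule) = @_;
    my @rhs = @{ $rule->{rhs} };
    if ( defined $rule->{min} ) {

        # sequence: lhs, the single rhs symbol, and the separator, if any
        my $separator = $rule->{separator};
        return join "\0", 'seq', $rule->{lhs}, $rhs[0],
            ( defined($separator) ? ( 1, $separator ) : (0) );
    }

    # BNF: lhs plus the rhs symbols in order
    return join "\0", 'bnf', $rule->{lhs}, @rhs;
}

my $first  = { lhs => 'e', rhs => [ 'e', 'plus', 't' ], action => 'return 1' };
my $second = { lhs => 'e', rhs => [ 'e', 'plus', 't' ], priority => 5 };
my $third  = { lhs => 'e', rhs => [ 't', 'plus', 'e' ] };

print duplicate_key($first) eq duplicate_key($second)
    ? "first and second are duplicates\n"
    : "first and second differ\n";
print duplicate_key($first) eq duplicate_key($third)
    ? "first and third are duplicates\n"
    : "first and third differ\n";
```

Note that action, priority and min play no part in the key, because the description above does not list them among the fields compared.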
terminals
The value of the terminals named argument must be a reference to an array of terminal descriptions. Terminal descriptions can be short form or long form. The short form is very short: it is the symbol name of the terminal as a scalar string.
A long form terminal description is a reference to an array of two elements. The first element is the symbol name of the terminal. The second element must be a reference to a hash of terminal options, with option names as hash keys and option values as hash values.
regex
The value of the regex terminal option must be a regular expression. It is used when Marpa is asked to match the terminals in the input text. When the tokens are supplied directly, for example when using the earleme method, the terminal's regex value is ignored. Only one of the regex and action terminal options may be specified. See the MDL document for details on writing terminal regexes.
action
The value of the action terminal option must be a string with code in the current semantics. Right now the only available semantics is Perl 5. The code will be interpreted as a lex action, which will be used to match the terminal in the input text. When the tokens are supplied directly, for example when using the earleme method, the terminal's action value is ignored. Only one of the regex and action terminal options may be specified. See the MDL document for details on writing lex actions.
prefix
The value of the prefix terminal option must be a regular expression. It will be used to match and discard text from the input before any attempt is made to match the terminal itself. The most common use is to discard leading whitespace. When the tokens are supplied directly, for example when using the earleme method, the terminal's prefix value is ignored.
priority
The value of the priority terminal option must be an integer. It can be negative. It will control the order in which terminal matches are attempted.
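A value for the terminals named argument might be sketched as follows. The symbol names and patterns are invented, and the sketch assumes that a compiled qr// object is an acceptable "regular expression" value. The action option is left out because the form of a lex action is described in the MDL document rather than here.

```perl
use strict;
use warnings;

# Terminal descriptions. Symbol names and patterns are invented.
my $terminals = [

    # short form: just the symbol name
    'plus',

    # long form: [ name, { options } ]
    # prefix discards leading whitespace before the terminal is matched
    [ 'number', { regex => qr/\d+/xms, prefix => qr/\s*/xms } ],

    # long form with a priority, which affects the order in which
    # terminal matches are attempted
    [ 'keyword_if', { regex => qr/if\b/xms, priority => 1 } ],
];

# Assumed call shape (not run here):
#   my $grammar = Marpa::Grammar->new( { terminals => $terminals } );
```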
start
The value of the start named argument must be a plumbing symbol name. It will be used as the start symbol for the grammar. Most of the plumbing named arguments may not be used in combination with a porcelain interface. The start named argument is an exception. It may be used to set the default for, or to override the choice of, the start symbol in the porcelain.
If you use the start named argument to specify a porcelain symbol, you must be careful to use the plumbing symbol name. The documentation for the porcelain should describe how to convert porcelain symbol names to plumbing symbol names.
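For a grammar built entirely with the plumbing, the start and rules named arguments could be combined as in the sketch below. The symbol names are invented, the check on the start symbol is a hypothetical convenience that Marpa does not provide, and the commented-out call assumes the named arguments are passed as a hash reference.

```perl
use strict;
use warnings;

# Symbol names are invented for illustration.
my %named_args = (
    start => 'expression',
    rules => [
        [ 'expression', [ 'expression', 'plus', 'term' ] ],
        [ 'expression', ['term'] ],
        [ 'term',       ['number'] ],
    ],
);

# Hypothetical sanity check, not part of Marpa: in a typical grammar the
# start symbol is the left hand side of at least one rule. The comparison
# is exact, as plumbing symbol names require.
my %is_lhs = map { ( $_->[0] => 1 ) } @{ $named_args{rules} };
die "start symbol $named_args{start} has no rules\n"
    if not $is_lhs{ $named_args{start} };

# Assumed call shape (not run here):
#   my $grammar = Marpa::Grammar->new( \%named_args );
```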
See the support section in the main module.
Jeffrey Kegler
Copyright 2007 - 2009 Jeffrey Kegler
This program is free software; you can redistribute it and/or modify it under the same terms as Perl 5.10.0.