Name

Marpa::R2::Scanless::G - Scanless interface grammars

Synopsis

    my $grammar = Marpa::R2::Scanless::G->new(
        {
            source          => \(<<'END_OF_SOURCE'),
    :default ::= action => do_first_arg
    :start ::= Script
    Script ::= Expression+ separator => comma action => do_script
    comma ~ [,]
    Expression ::=
        Number
        | '(' Expression ')' action => do_parens assoc => group
       || Expression '**' Expression action => do_pow assoc => right
       || Expression '*' Expression action => do_multiply
        | Expression '/' Expression action => do_divide
       || Expression '+' Expression action => do_add
        | Expression '-' Expression action => do_subtract
    Number ~ [\d]+

    :discard ~ whitespace
    whitespace ~ [\s]+
    # allow comments
    :discard ~ <hash comment>
    <hash comment> ~ <terminated hash comment> | <unterminated
       final hash comment>
    <terminated hash comment> ~ '#' <hash comment body> <vertical space char>
    <unterminated final hash comment> ~ '#' <hash comment body>
    <hash comment body> ~ <hash comment char>*
    <vertical space char> ~ [\x{A}\x{B}\x{C}\x{D}\x{2028}\x{2029}]
    <hash comment char> ~ [^\x{A}\x{B}\x{C}\x{D}\x{2028}\x{2029}]
    END_OF_SOURCE
        }
    );

About this document

This page is the reference for the grammar objects of Marpa's Scanless interface.

Constructor

The new() method is the constructor for Scanless grammars. An example of its use is above. The new() constructor accepts a hash of named arguments. The following named arguments are allowed:

bless_package

Specifies the name of a Perl package. The package is used for blessing node values into a Perl class, in conjunction with the bless adverb. bless_package should not be confused with the SLIF's semantics_package recognizer setting. The two are not closely related.
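
As a minimal sketch of how bless_package works together with the bless adverb, consider the following. The package name My_Nodes and the two-word grammar are made up for illustration. With a blessing of ::lhs in the :default pseudo-rule, the value of each rule is expected to be blessed into a class whose name is formed from the bless_package and the name of the rule's LHS symbol.

```perl
use strict;
use warnings;
use Marpa::R2;

# Hypothetical grammar: a Pair is two words.
my $dsl = <<'END_OF_DSL';
:default ::= action => [values] bless => ::lhs
:start ::= Pair
Pair ::= Word Word
Word ~ [\w]+
:discard ~ whitespace
whitespace ~ [\s]+
END_OF_DSL

my $grammar = Marpa::R2::Scanless::G->new(
    {   bless_package => 'My_Nodes',    # illustrative package name
        source        => \$dsl,
    }
);

my $input     = 'hello world';
my $value_ref = $grammar->parse( \$input );
my $node      = ${$value_ref};

# The node is an array of the child values, blessed into a class
# whose name begins with the bless_package.
print ref $node, "\n";
print join( q{ }, @{$node} ), "\n";
```

No semantics package is needed here, because the only action is the [values] array descriptor.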

source

The value of the source named argument must be a reference to a string which contains a description of the grammar. The string's format is a domain-specific language, described in its own document.

trace_file_handle

The value is a file handle. Trace output and warning messages go to the trace file handle. By default the trace file handle is STDERR.
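
One possible use of trace_file_handle is to capture trace output in a string rather than letting it go to STDERR. The sketch below uses an in-memory file handle. It assumes that the recognizer created by parse() inherits the grammar's trace file handle, and it uses the recognizer's trace_terminals setting to produce some trace output. The grammar is a throwaway example.

```perl
use strict;
use warnings;
use Marpa::R2;

# Hypothetical one-rule grammar.
my $dsl = <<'END_OF_DSL';
:default ::= action => ::first
:start ::= Top
Top ::= Number
Number ~ [\d]+
END_OF_DSL

# Send trace output to an in-memory string instead of STDERR.
my $trace_output = q{};
open my $trace_fh, '>', \$trace_output
    or die "Cannot open in-memory trace handle: $!";

my $grammar = Marpa::R2::Scanless::G->new(
    {   source            => \$dsl,
        trace_file_handle => $trace_fh,
    }
);

# trace_terminals is a recognizer setting; parse() passes the
# hash reference through to the recognizer's constructor.
my $input     = '42';
my $value_ref = $grammar->parse( \$input, { trace_terminals => 1 } );
close $trace_fh or die "Cannot close in-memory trace handle: $!";

print "Parse value: ${$value_ref}\n";
print "Captured trace:\n$trace_output";
```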

Discouraged named arguments

action_object

Use of this argument is discouraged in favor of the semantics_package named argument of the SLIF recognizer. Like the semantics_package named argument, it sets the semantic package. Unlike the semantics_package named argument, it is a fatal error if used together with an explicit per-parse argument of the SLIF recognizer's value() method. It is also a fatal error to try to use the semantics_package and action_object arguments together.

default_action

Use of this argument is deprecated in favor of using the action adverb in a :default pseudo-rule. It specifies the default action that will be used for the G1 grammar. For details on the possible default action values and how they are used, see the action adverb.

Mutators

parse()

    my $grammar   = Marpa::R2::Scanless::G->new( { source => \$dsl } );
    my $input     = '42 * 1 + 7';
    my $value_ref = $grammar->parse( \$input, 'My_Actions' );

This very high-level method is a "one shot" way of producing a parse value from a grammar and an input stream. The features this method provides are those most often wanted in the "first cut" of a parser.

As a parser grows, users are likely to find that their application has outgrown this method. Rather than spending a lot of time exploring ways to adapt this method to expanding needs, users are encouraged to abandon it quickly in favor of the lower-level calls. As an example of how to make this transition, the tutorial in Marpa::R2 is reimplemented using low-level calls in Marpa::R2::Tutorial2.

The parse() method takes one or more arguments. The first argument, which is required, is a reference to an input string. Optionally, the second argument may be a string specifying the package name for the semantics. The remaining arguments (including the second argument, if it exists but is not a string) must be references to hashes of named arguments. These hash references will be passed, as is, to the constructor for the recognizer.

This method returns a reference to the only parse value, if there is exactly one parse value. If there is no parse, or if the parse is ambiguous, parse() throws an exception.
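
Because parse() throws an exception on failure, callers that handle untrusted input will usually want to wrap it. The following is a sketch only: the grammar, the My_Actions package and the do_add action are invented for this example.

```perl
use strict;
use warnings;
use Marpa::R2;

# Hypothetical grammar for sums of integers.
my $dsl = <<'END_OF_DSL';
:default ::= action => ::first
:start ::= Sum
Sum ::= Number
      | Sum '+' Number action => do_add
Number ~ [\d]+
:discard ~ whitespace
whitespace ~ [\s]+
END_OF_DSL

# Illustrative semantics package.  The first argument of every
# action is the per-parse argument, which this sketch ignores.
sub My_Actions::do_add {
    my ( undef, $left, undef, $right ) = @_;
    return $left + $right;
}

my $grammar = Marpa::R2::Scanless::G->new( { source => \$dsl } );

# The second argument names the semantics package.
my $good_input = '1 + 2 + 3';
my $value_ref  = $grammar->parse( \$good_input, 'My_Actions' );
print "Value: ${$value_ref}\n";

# A failed parse throws, so trap the exception with eval.
my $bad_input = '1 + + 2';
my $ok = eval { $grammar->parse( \$bad_input, 'My_Actions' ); 1 };
print "Parse failed: $@" if not $ok;
```

Any hash references that follow the package name are handed to the recognizer's constructor, which is one way to turn on recognizer settings without leaving parse().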

set()

    $grammar->set( { trace_file_handle => $trace_fh } );

This method allows the named arguments to be changed after an SLIF grammar is created. Currently, the only argument that may be changed is trace_file_handle.

Accessors

rule_expand()

    my ($lhs_id, @rhs_ids) = $grammar->rule_expand($rule_id);
    $text .= "Rule #$rule_id: $lhs_id ::= " . (join q{ }, @rhs_ids) . "\n";
    my ($lhs_id, @rhs_ids) = $grammar->rule_expand($rule_id, 'L0');
    $text .= "L0 Rule #$rule_id: $lhs_id ::= " . (join q{ }, @rhs_ids) . "\n";

"Expands" a rule ID into symbol ID's. An array of symbol ID's is returned. The ID of the LHS symbol is the first element, and the remaining elements are the ID's of the RHS symbols, in order. Returns an empty array if the rule does not exist.

The first argument is the ID of the rule to be "expanded". The second, optional, argument is the name of a subgrammar. Currently there are L0 and G1 subgrammars. The default subgrammar is G1.
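
Since rule_expand() returns symbol ID's rather than names, it is usually combined with one of the symbol accessors. The sketch below lists the rules of both subgrammars using symbol_display_form(). The grammar is a made-up example, and the exact text printed will depend on Marpa's internal symbols.

```perl
use strict;
use warnings;
use Marpa::R2;

# Hypothetical grammar for sums of integers.
my $dsl = <<'END_OF_DSL';
:default ::= action => ::first
:start ::= Sum
Sum ::= Number
      | Sum '+' Number
Number ~ [\d]+
END_OF_DSL

my $grammar = Marpa::R2::Scanless::G->new( { source => \$dsl } );

my $text = q{};
for my $subgrammar (qw(G1 L0)) {
    for my $rule_id ( $grammar->rule_ids($subgrammar) ) {

        # First element is the LHS symbol ID, the rest are the RHS.
        my ( $lhs_id, @rhs_ids ) =
            $grammar->rule_expand( $rule_id, $subgrammar );
        my $lhs = $grammar->symbol_display_form( $lhs_id, $subgrammar );
        my @rhs =
            map { $grammar->symbol_display_form( $_, $subgrammar ) } @rhs_ids;
        $text .= "$subgrammar rule $rule_id: $lhs ::= @rhs\n";
    }
}
print $text;
```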

rule_ids()

    do_something($_) for $grammar->rule_ids();
    do_something($_) for $grammar->rule_ids('L0');

Returns a list of the rule ID's as an array. Takes one, optional, argument: the name of a subgrammar. Currently there are L0 and G1 subgrammars. The default subgrammar is G1.

rule_name()

    push @rule_names, $grammar->rule_name($_) for $grammar->rule_ids();

Given a rule ID, returns the rule name. A rule name is as defined by the name adverb. If no rule name was defined, the rule name is the name of the LHS symbol.
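
As an illustration, the hypothetical grammar below gives a name to one alternative with the name adverb and leaves the other unnamed. Per the description above, the unnamed rule should report the name of its LHS symbol. The rule names in the DSL are invented for this sketch.

```perl
use strict;
use warnings;
use Marpa::R2;

# Hypothetical grammar: only the second alternative is named.
my $dsl = <<'END_OF_DSL';
:default ::= action => ::first
:start ::= Sum
Sum ::= Number
      | Sum '+' Number name => addition
Number ~ [\d]+
END_OF_DSL

my $grammar = Marpa::R2::Scanless::G->new( { source => \$dsl } );

# Collect the name of every G1 rule, keyed by rule ID.
my %name_of_rule;
for my $rule_id ( $grammar->rule_ids() ) {
    $name_of_rule{$rule_id} = $grammar->rule_name($rule_id);
}

for my $rule_id ( sort { $a <=> $b } keys %name_of_rule ) {
    print "Rule #$rule_id is named $name_of_rule{$rule_id}\n";
}
```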

rule_show()

    my $rule_description = $grammar->rule_show($rule_id);
    my $rule_description = $grammar->rule_show($rule_id, 'L0');

For a rule ID, returns a string describing that rule in a form which is useful for tracing and debugging, but subject to change. Returns a Perl undef if the rule does not exist.

The first argument is the ID of the rule to be displayed. The second, optional, argument is the name of a subgrammar. Currently there are L0 and G1 subgrammars. The default subgrammar is G1.

start_symbol_id()

    my $start_id = $grammar->start_symbol_id();

Returns the ID of the start symbol. Note that there is no method to return the ID of the start rule, because there may be no unique start rule.

symbol_description()

    my $description = $grammar->symbol_description($symbol_id)
        // '[No description]';
    $text .= "symbol number: $symbol_id  description $description\n";
    my $description = $grammar->symbol_description( $symbol_id, 'L0' )
        // '[No description]';
    $text .= "L0 symbol number: $symbol_id  description $description\n";

Given a symbol ID, returns a description of the symbol. The description may not be defined. Currently internal symbols tend to have descriptions, while symbols explicitly specified by the user in the DSL are treated as self-explanatory. The description is intended for humans to read, and is subject to change.

The first argument is the symbol ID. A second, optional, argument is the subgrammar. Currently there are L0 and G1 subgrammars. The default subgrammar is G1. Returns a Perl undef if the symbol does not exist, or if it has no description.

symbol_display_form()

    my $display_form = $grammar->symbol_display_form($symbol_id);
    $text
        .= "symbol number: $symbol_id  name in display form: $display_form\n";
    my $display_form = $grammar->symbol_display_form( $symbol_id, 'L0' );
    $text
        .= "L0 symbol number: $symbol_id  name in display form: $display_form\n";

Given a symbol ID, returns the "display form" of the symbol. This is the symbol in a form thought most suitable for display in messages, etc. The display form is always defined. The display form of a symbol is not useable as a name -- it is not necessarily unique, and is subject to change.

The first argument is the symbol ID. A second, optional, argument is the subgrammar. Currently there are L0 and G1 subgrammars. The default subgrammar is G1. Returns a Perl undef if the symbol does not exist.

symbol_dsl_form()

    my $dsl_form = $grammar->symbol_dsl_form($symbol_id)
        // '[No name in DSL form]';
    $text .= "symbol number: $symbol_id  DSL form: $dsl_form\n";
    my $dsl_form = $grammar->symbol_dsl_form( $symbol_id, 'L0' )
        // '[No name in DSL form]';
    $text .= "L0 symbol number: $symbol_id  DSL form: $dsl_form\n";

Given a symbol ID, returns the "DSL form" of the symbol. This is the name of the symbol in a form similar to the way it is specified by the user in the DSL. If the symbol has an explicit name, the symbol's DSL form is the same as its explicit name. If the symbol does not have an explicit name, the method may return a Perl undef, or it may return a DSL name invented by Marpa and intended to be suggestive. The DSL form of a symbol is not intended for use as a symbol name -- it is not necessarily unique, is not always defined, and it is subject to change.

The first argument is the symbol ID. A second, optional, argument is the subgrammar. Currently there are L0 and G1 subgrammars. The default subgrammar is G1. Returns a Perl undef if the symbol does not exist, or if it has no DSL form.

symbol_ids()

    do_something($_) for $grammar->symbol_ids();
    do_something($_) for $grammar->symbol_ids('L0');

Returns a list of the symbol ID's as an array. Takes one, optional, argument: the name of a subgrammar. Currently there are L0 and G1 subgrammars. The default subgrammar is G1.

symbol_name()

    my $name = $grammar->symbol_name($symbol_id);
    $text .= "symbol number: $symbol_id  name: $name\n";
    my $name = $grammar->symbol_name( $symbol_id, 'L0' );
    $text .= "L0 symbol number: $symbol_id  name: $name\n";

Given a symbol ID, returns the name of the symbol. For every symbol ID, this method's return value will be defined and will be unique to that symbol ID, so that it is suitable for use as a symbol name. If a symbol has an explicit name, the return value will be the symbol's explicit name. If there is no explicit name, it will be an internal name. Internal names are subject to change.

The first argument is the symbol ID. A second, optional, argument is the subgrammar. Currently there are L0 and G1 subgrammars. The default subgrammar is G1. Returns a Perl undef if the symbol does not exist.
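
The four symbol accessors are easiest to compare side by side. The following sketch prints all of them for every G1 symbol of a small, made-up grammar, and also reports the start symbol. The '[none]' placeholder is this example's own convention for an undefined return value.

```perl
use strict;
use warnings;
use Marpa::R2;

# Hypothetical grammar for sums of integers.
my $dsl = <<'END_OF_DSL';
:default ::= action => ::first
:start ::= Sum
Sum ::= Number
      | Sum '+' Number
Number ~ [\d]+
END_OF_DSL

my $grammar = Marpa::R2::Scanless::G->new( { source => \$dsl } );

my @symbol_ids = $grammar->symbol_ids();
my %seen_name;
for my $symbol_id (@symbol_ids) {

    # The name and the display form are always defined;
    # the DSL form and the description may be undef.
    my $name        = $grammar->symbol_name($symbol_id);
    my $display     = $grammar->symbol_display_form($symbol_id);
    my $dsl_form    = $grammar->symbol_dsl_form($symbol_id) // '[none]';
    my $description = $grammar->symbol_description($symbol_id) // '[none]';
    $seen_name{$name}++;
    print "symbol $symbol_id: name=$name display=$display ",
        "dsl=$dsl_form description=$description\n";
}

my $start_id = $grammar->start_symbol_id();
print 'Start symbol: ', $grammar->symbol_display_form($start_id), "\n";
```

Passing 'L0' as a second argument to each accessor, and to symbol_ids(), produces the same table for the L0 subgrammar.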

Trace methods

show_rules()

    my $show_rules_output = $grammar->show_rules();
    $show_rules_output .= $grammar->show_rules(3, 'L0');

The show_rules() method returns a description of the rules for a subgrammar, by default G1. It is useful for understanding the rules as they appear in trace and debugging outputs. To allow for improvements in Marpa::R2, the output of show_rules() is subject to change.

The first optional argument can be a numeric verbosity level. The default verbosity is 1, which is adequate for most purposes. A verbosity of 2 prints additional information useful for those new to SLIF tracing and debugging. A verbosity of 3 prints additional information for experts.

The second, optional, argument is the name of a subgrammar. Currently there are L0 and G1 subgrammars.

show_symbols()

    $show_symbols_output .= $grammar->show_symbols(3);
    $show_symbols_output .= $grammar->show_symbols(3, 'L0');

The show_symbols() method returns a description of the symbols for a subgrammar, by default G1. It is useful for understanding the symbols as they appear in trace and debugging outputs. To allow for improvements in Marpa::R2, the output of show_symbols() is subject to change.

The first, optional, argument can be a numeric verbosity level. The default verbosity is 1, which is adequate for most purposes. A verbosity of 2 prints additional information useful for those new to SLIF tracing and debugging. A verbosity of 3 prints additional information for experts.

The second, optional, argument is the name of a subgrammar. Currently there are L0 and G1 subgrammars.
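
A quick way to see what the two trace methods produce is to dump both subgrammars of a small grammar. The grammar below is invented for this sketch. Because the output format is subject to change, the example only prints the text and does not parse it.

```perl
use strict;
use warnings;
use Marpa::R2;

# Hypothetical grammar for sums of integers.
my $dsl = <<'END_OF_DSL';
:default ::= action => ::first
:start ::= Sum
Sum ::= Number
      | Sum '+' Number
Number ~ [\d]+
END_OF_DSL

my $grammar = Marpa::R2::Scanless::G->new( { source => \$dsl } );

# Default verbosity and default subgrammar (G1).
my $g1_rules   = $grammar->show_rules();
my $g1_symbols = $grammar->show_symbols();

# Explicit verbosity level and subgrammar.
my $l0_rules   = $grammar->show_rules( 2, 'L0' );
my $l0_symbols = $grammar->show_symbols( 2, 'L0' );

print "G1 rules:\n$g1_rules\n";
print "G1 symbols:\n$g1_symbols\n";
print "L0 rules:\n$l0_rules\n";
print "L0 symbols:\n$l0_symbols\n";
```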

Discouraged methods

Discouraged methods are those that continue to be supported, but whose use is discouraged for one reason or another.

g0_rule()

    my @g0_rule_ids = $grammar->g0_rule_ids();
    for my $g0_rule_id (@g0_rule_ids) {
        $g0_rules_description .= "$g0_rule_id "
            . ( join q{ }, map {"<$_>"} $grammar->g0_rule($g0_rule_id) ) . "\n";
    }

Please prefer "rule_expand()", together with "symbol_name()" or "symbol_display_form()". Given a L0 rule ID as its argument, returns an array containing the names of the symbols of that rule. The g0_rule() method returns a Perl false if no L0 rule with that rule ID exists. If the L0 rule ID exists, g0_rule() returns a list of one or more symbol names. The first symbol name will be that of the rule's LHS symbol. The rest of the list will be the names of the rule's RHS symbols, in order.

g0_rule_ids()

    my @g0_rule_ids = $grammar->g0_rule_ids();
    for my $g0_rule_id (@g0_rule_ids) {
        $g0_rules_description .= "$g0_rule_id "
            . ( join q{ }, map {"<$_>"} $grammar->g0_rule($g0_rule_id) ) . "\n";
    }

Please prefer "rule_ids()", called with 'L0' as its subgrammar argument. Returns a list of the L0 rule ID's.

g1_rule_ids()

    my @g1_rule_ids = $grammar->g1_rule_ids();
    for my $g1_rule_id (@g1_rule_ids) {
        $g1_rules_description .= "$g1_rule_id "
            . ( join q{ }, map {"<$_>"} $grammar->rule($g1_rule_id) ) . "\n";
    }

Please prefer "rule_ids()". Returns a list of the G1 rule ID's.

rule()

    my @g1_rule_ids = $grammar->g1_rule_ids();
    for my $g1_rule_id (@g1_rule_ids) {
        $g1_rules_description .= "$g1_rule_id "
            . ( join q{ }, map {"<$_>"} $grammar->rule($g1_rule_id) ) . "\n";
    }

Please prefer "rule_expand()", together with "symbol_name()" or "symbol_display_form()". Given a G1 rule ID as its argument, returns an array containing the names of the symbols of that rule. The rule() method returns a Perl false if no G1 rule with that rule ID exists. If the rule ID exists, rule() returns a list of one or more symbol names. The first symbol name will be that of the rule's LHS symbol. The rest of the list will be the names of the rule's RHS symbols, in order. The SLIF's rule() method is useful in combination with the SLIF's progress method, whose output identifies rules by rule ID.

Copyright and License

  Copyright 2022 Jeffrey Kegler
  This file is part of Marpa::R2.  Marpa::R2 is free software: you can
  redistribute it and/or modify it under the terms of the GNU Lesser
  General Public License as published by the Free Software Foundation,
  either version 3 of the License, or (at your option) any later version.

  Marpa::R2 is distributed in the hope that it will be useful,
  but WITHOUT ANY WARRANTY; without even the implied warranty of
  MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
  Lesser General Public License for more details.

  You should have received a copy of the GNU Lesser
  General Public License along with Marpa::R2.  If not, see
  http://www.gnu.org/licenses/.