NAME

Marpa::R3::Changes - Differences between Marpa::R2 and Marpa::R3

About this document

This document describes the significant incompatibilities between Marpa::R2 and Marpa::R3. It is intended for readers already familiar with Marpa::R2, who are writing new applications for Marpa::R3, and for readers migrating Marpa::R2 applications and tools to Marpa::R3.

Differences that do not give rise to a significant incompatibility are not included. Here "significant" means "likely to impact legacy Marpa::R2 code".

This document is a checklist and, on its own, is not a complete guide for migration. It avoids duplicating material in the main Marpa::R3 documents. For example, if a Marpa::R2 method is replaced by a Marpa::R3 method, this document may simply note that fact and refer the reader to the description of the new method in the other documents.

Marpa::R3 contains so many highly interconnected changes that nothing in it can safely be said to be simply "unchanged". Some methods and named arguments are described as "Mostly unchanged". Here "mostly" is used in the sense in which Douglas Adams uses it in the Hitchhiker's Guide, which describes the Earth as "mostly harmless": it means "for most users in typical circumstances".

Events

In Marpa::R3, events have been changed from an event-driven mechanism to a callback mechanism. A new events_handlers named argument has been added to Marpa::R3::Recognizer::new(). The $slr->events() method has been removed.

New valuer object

There is a new valuer object, with its own new POD document.

Marpa strings must be UTF-8

Marpa expects all strings passed to it to be valid UTF-8. (Note that all 7-bit ASCII strings are valid UTF-8.)
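Legacy code that reads input in other encodings may want to check its strings before handing them to Marpa. The following is a minimal sketch using the core Encode module. The helper name is_valid_utf8 is hypothetical and is not part of Marpa::R3, and the sketch assumes the input is an octet string.

```perl
use strict;
use warnings;
use Encode ();

# Hypothetical helper (not part of Marpa::R3): returns 1 if $bytes is a
# well-formed UTF-8 octet string, 0 otherwise.
sub is_valid_utf8 {
    my ($bytes) = @_;
    my $copy = $bytes;    # decode() with a CHECK value may modify its argument
    my $ok = eval { Encode::decode( 'UTF-8', $copy, Encode::FB_CROAK ); 1 };
    return $ok ? 1 : 0;
}

my $ascii  = "a = 42";         # 7-bit ASCII is always valid UTF-8
my $latin1 = "caf\xE9";        # a lone 0xE9 byte is not valid UTF-8
my $utf8   = "caf\xC3\xA9";    # the same word, encoded as UTF-8

for my $string ( $ascii, $latin1, $utf8 ) {
    printf "%s\n", is_valid_utf8($string) ? 'valid' : 'NOT valid';
}
```

A string that fails this check would have to be re-encoded, for example with Encode::encode('UTF-8', ...), before being passed to Marpa.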

"Eager" lexemes added

See "eager" in Marpa::R3::DSL.

Problems with kernel Earley items in progress reports fixed

Progress reports had been misreporting some Earley items. The misreported items were certain kernel Earley items containing one or more proper nullables. By kernel Earley item, I mean an Earley item where the dot is in the middle: not at the beginning and not at the end. In other words, kernel Earley items are items which are neither predictions nor completions.

This bug is actually quite obscure: almost all interest in progress reports is in completed Earley items, which were not affected. So obscure, in fact, was this bug that it went unnoticed in use. It surfaced only when I reread the code and realized that some corner cases were not being dealt with correctly. This bug is now fixed.

Semantics moved from recognizer to grammar

In Marpa::R2, the semantics were not finally settled until the $recce->value() call. In Marpa::R3, the semantics are fully settled in the grammar.

Actions that are Perl names must resolve to subroutines

It was a little-noticed feature of Marpa::R2 that actions specified as Perl names (like "My_Action::doit") could resolve to scalars. In Marpa::R3 they must resolve to Perl subroutines.
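When auditing a legacy grammar, it can help to check in advance which action names resolve to subroutines. The sketch below is plain Perl and does not call Marpa. The helper resolves_to_sub and the variable $My_Action::a_scalar are hypothetical names used only for illustration.

```perl
use strict;
use warnings;

package My_Action;

our $a_scalar = 'a constant';    # a scalar: not acceptable as a Marpa::R3 action

sub doit {
    my ( $per_parse, $values ) = @_;
    return scalar @{$values};
}

package main;

# Hypothetical helper: returns 1 if $name is the fully qualified name of a
# defined Perl subroutine, 0 otherwise.
sub resolves_to_sub {
    my ($name) = @_;
    no strict 'refs';
    my $ok = defined &{$name};    # tests for the subroutine without calling it
    return $ok ? 1 : 0;
}

for my $name ( 'My_Action::doit', 'My_Action::a_scalar' ) {
    printf "%s %s\n", $name,
        resolves_to_sub($name) ? 'is a subroutine' : 'is NOT a subroutine';
}
```

An action that relied on a scalar can usually be migrated by wrapping the value in a subroutine that returns it.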

The Stuifzand interface (PSIF) has been removed

The Stuifzand interface (PSIF), and its documentation, have been removed. Important in the development of Marpa, it now has little or no usage.

The Thin interface (THIF) has been removed

The THIF was a "thin" Perl interface. It has been removed.

The NAIF has been removed

The NAIF was an older interface using hashes of named variables, instead of a DSL. It has been removed.

LATM is now the default

[name, values] is now the default action

Unicode now works in the SLIF DSL

Context::location is now Context::g1_range

New context variable, Context::g1_span

Rule and symbol accessors are completely different

In Marpa::R3, symbols and rules are divided into external and internal. As a result, the grammar accessors for symbols and rules changed completely between Marpa::R2 and Marpa::R3. The changes are so massive that any summary of the changes to the grammar accessors would essentially be a repetition of their documentation.

The semantic closure now always receives exactly 2 arguments

Under Marpa::R2, the semantic closure received a varying number of arguments, depending on circumstances. Under Marpa::R3, the semantic closure always receives exactly 2 arguments. The first argument is the per-parse object. The second argument is a reference to an array containing the values of the child nodes, in lexical order. If there were no child nodes visible to the semantics, then the second argument is a reference to an empty array.
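As an illustration of the two-argument convention, here is a sketch in plain Perl that does not call Marpa. The closure do_add and the adapter adapt_r2_closure are hypothetical names. The adapter assumes a legacy closure that expects the per-parse object followed by the child values as a flat list, which is only one of the forms a Marpa::R2 closure could take.

```perl
use strict;
use warnings;

# A closure written for the Marpa::R3 convention:
#   $_[0] is the per-parse object,
#   $_[1] is a reference to an array of child values, in lexical order.
sub do_add {
    my ( $per_parse, $values ) = @_;
    my ( $left, $operator, $right ) = @{$values};    # $operator is unused here
    return $left + $right;
}

# Hypothetical adapter: wraps a legacy closure that expects
# ($per_parse, @child_values) so that it accepts ($per_parse, \@child_values).
sub adapt_r2_closure {
    my ($r2_closure) = @_;
    return sub {
        my ( $per_parse, $values ) = @_;
        return $r2_closure->( $per_parse, @{$values} );
    };
}

# Simulate the calls by hand, without a parser.
my $legacy_count = sub { my ( $per_parse, @children ) = @_; return scalar @children };
my $adapted      = adapt_r2_closure($legacy_count);

print do_add( {}, [ 2, '+', 3 ] ), "\n";    # sum of the first and third child
print $adapted->( {}, [] ), "\n";           # no visible children: empty array
```

Wrapping legacy closures this way keeps their bodies unchanged, at the cost of one extra subroutine call per node.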

Marpa::R2 Grammar named arguments, alphabetically

This section accounts for each of the Marpa::R2 grammar's named arguments, in alphabetical order.

bless_package

Mostly unchanged.

ranking_method

Formerly a recognizer named argument. The high_rule_only option has been renamed to high_rank_only.

source

Mostly unchanged.

trace_file_handle

Mostly unchanged.

action_object

Removed.

default_action

Removed.

Marpa::R2 Grammar methods, alphabetically

This section accounts for each of the Marpa::R2 grammar's methods, in alphabetical order.

g0_rule()

See "Rule and symbol accessors are completely different".

g0_rule_ids()

See "Rule and symbol accessors are completely different".

g1_rule_ids()

See "Rule and symbol accessors are completely different".

new()

See the entries for changes in the grammar named arguments.

parse()

Mostly unchanged.

rule()

See "Rule and symbol accessors are completely different".

rule_expand()

See "Rule and symbol accessors are completely different".

rule_ids()

See "Rule and symbol accessors are completely different".

rule_name()

See "Rule and symbol accessors are completely different".

rule_show()

See "Rule and symbol accessors are completely different".

set()

See the entries for changes in the grammar named arguments.

show_rules()

See "Rule and symbol accessors are completely different".

show_symbols()

See "Rule and symbol accessors are completely different".

start_symbol_id()

See "Rule and symbol accessors are completely different".

symbol_description()

See "Rule and symbol accessors are completely different".

symbol_display_form()

See "Rule and symbol accessors are completely different".

symbol_dsl_form()

See "Rule and symbol accessors are completely different".

symbol_ids()

See "Rule and symbol accessors are completely different".

symbol_name()

See "Rule and symbol accessors are completely different".

Marpa::R2 Recognizer named arguments, alphabetically

This section accounts for each of the Marpa::R2 recognizer's named arguments, in alphabetical order.

end

Mostly unchanged.

event_is_active

Mostly unchanged.

exhaustion

Removed.

grammar

Mostly unchanged.

max_parses

The max_parses recognizer named argument of Marpa::R2 has been removed. In Marpa::R3, it is a named argument of the new valuer objects.

ranking_method

Changed to a grammar named argument. The high_rule_only option has been renamed to high_rank_only.
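For applications that build their named arguments as hashes, the move can be done mechanically. The following is a sketch in plain Perl. The helper migrate_ranking_method is hypothetical, and it only moves and renames the ranking_method setting; it does not validate any other argument.

```perl
use strict;
use warnings;

# Hypothetical helper: takes references to Marpa::R2-style grammar and
# recognizer argument hashes and returns adjusted copies, with
# ranking_method moved to the grammar and high_rule_only renamed.
sub migrate_ranking_method {
    my ( $grammar_args, $recce_args ) = @_;
    my %grammar = %{$grammar_args};
    my %recce   = %{$recce_args};
    if ( exists $recce{ranking_method} ) {
        my $method = delete $recce{ranking_method};
        $method = 'high_rank_only' if $method eq 'high_rule_only';
        $grammar{ranking_method} = $method;
    }
    return ( \%grammar, \%recce );
}

my ( $grammar_args, $recce_args ) = migrate_ranking_method(
    { bless_package  => 'My_Nodes' },
    { ranking_method => 'high_rule_only', trace_terminals => 1 },
);

print "grammar ranking_method: $grammar_args->{ranking_method}\n";
print 'recognizer still has ranking_method: ',
    ( exists $recce_args->{ranking_method} ? 'yes' : 'no' ), "\n";
```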

rejection

Removed.

semantics_package

Removed.

too_many_earley_items

Mostly unchanged.

trace_file_handle

Mostly unchanged.

trace_terminals

Mostly unchanged.

trace_values

Mostly unchanged.

Marpa::R2 Recognizer methods, alphabetically

This section accounts for each of the Marpa::R2 recognizer's methods, in alphabetical order.

activate()

Mostly unchanged.

ambiguity_metric()

The Marpa::R2::Recognizer::ambiguity_metric() method has been removed. Its purpose is now served by the valuer's ambiguity_level() method.

ambiguous()

The Marpa::R2::Recognizer::ambiguous() method has been removed. Its purpose is now served by the valuer's ambiguous() method.

current_g1_location()

The Marpa::R2::Recognizer::current_g1_location() method has been removed. Its purpose is now served by the Marpa::R3 recognizer's block_progress() method.

event()

Removed.

events()

Removed. See "Events".

exhausted()

Mostly unchanged.

g1_location_to_span()

Removed.

input_length()

The arguments of $recce->input_length() have changed. Its first parameter now is the block id.

last_completed()

Mostly unchanged.

last_completed_range()

Removed.

last_completed_span()

Removed.

lexeme_alternative()

The interface to $recce->lexeme_alternative() has changed. Some of its functionality is taken over by the new $recce->lexeme_alternative_literal() method.

lexeme_complete()

The arguments of $recce->lexeme_complete() have changed. Its first parameter now is the block id.

lexeme_priority_set()

Mostly unchanged.

lexeme_read()

The Marpa::R2::Recognizer::lexeme_read() method has been removed. Its function is provided by the new Marpa::R3::Recognizer::lexeme_read_block() and Marpa::R3::Recognizer::lexeme_read_string() methods. A lexeme_read() method may reappear in Marpa::R3 in a non-backward-compatible form.

line_column()

The interface to Marpa::R2::Recognizer::line_column() has changed to allow multi-block input.

literal()

The interface to Marpa::R2::Recognizer::literal() has changed to allow multi-block input.

new()

See the entries for changes in the recognizer named arguments.

pause_lexeme()

Removed.

pause_span()

The Marpa::R2::Recognizer::pause_span() method has been removed. Event location information is now available as arguments to the event handlers.

pos()

Mostly unchanged.

progress()

Marpa::R2::Recognizer::progress() reported the dot position of completions as -1. In Marpa::R3, $slr->progress() reports the dot position of completions as a non-negative integer, consistent with other dot positions. As a reminder, the dot position of a completed production is always the same as its RHS length.
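Legacy code that tests for completions by comparing the dot position with -1 needs adjusting. The sketch below, in plain Perl, shows the two conventions side by side. The helper names are hypothetical, and the RHS length is assumed to be obtained separately from the grammar.

```perl
use strict;
use warnings;

# Hypothetical helper: converts a Marpa::R2-style dot position, where -1
# marks a completion, to the Marpa::R3 convention, where a completion's
# dot position equals the RHS length of the rule.
sub normalize_dot {
    my ( $dot, $rhs_length ) = @_;
    return $dot < 0 ? $rhs_length : $dot;
}

# Under the Marpa::R3 convention, an item is a completion exactly when
# its dot position equals the RHS length.
sub is_completion {
    my ( $dot, $rhs_length ) = @_;
    return $dot == $rhs_length ? 1 : 0;
}

my $rhs_length = 3;    # for example, a rule of the form: E ::= E '+' T
for my $r2_dot ( 0, 2, -1 ) {
    my $r3_dot = normalize_dot( $r2_dot, $rhs_length );
    printf "R2 dot %2d -> R3 dot %d, completion: %s\n",
        $r2_dot, $r3_dot, is_completion( $r3_dot, $rhs_length ) ? 'yes' : 'no';
}
```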

range_to_string()

Removed.

read()

The interface to the $recce->read() method has changed in major ways. One important change is that the $recce->read() method may now be called multiple times during a parse, each time with a new string. These strings are called input blocks. For more details see Marpa::R3::Recognizer.

resume()

Mostly unchanged.

series_restart()

Removed.

set()

See the entries for changes in the recognizer named arguments.

show_progress()

Renamed to progress_show(). Also, see "Rule and symbol accessors are completely different".

substring()

$recce->substring() has been renamed $recce->g1_literal().

terminals_expected()

Mostly unchanged.

value()

The recognizer value() method has changed. Most notably, it now throws a fatal error if the parse is ambiguous -- this is what most applications want. Details are in its documentation.

For dealing with ambiguous parses, and other advanced techniques, there is a new valuer object.

COPYRIGHT AND LICENSE

  Marpa::R3 is Copyright (C) 2018, Jeffrey Kegler.

  This module is free software; you can redistribute it and/or modify it
  under the same terms as Perl 5.10.1. For more details, see the full text
  of the licenses in the directory LICENSES.

  This program is distributed in the hope that it will be
  useful, but without any warranty; without even the implied
  warranty of merchantability or fitness for a particular purpose.