NAME

Parse::Marpa::Doc::To_Do - Marpa's To Do List

PRINCIPLES

MDLex and MDL will know about Marpa, but not vice versa.

ORDERED TASKS

  • Add OBJECT flag to Grammar.

  • Make SELF_ARG flag operative. If on, the first arg of actions will be the action object, or an unblessed hash if there is no object. Default to off.

  • Convert tests, ../bin/mdl, ::mdl(), etc. (everything but MDL bootstrap) to use SELF_ARG. The MDL will be replaced after conversion to the unfactored MDL.

  • Allow multiple hashes as args to Grammar constructor as mechanism for specifying an order in which the options should be applied. I hope this will be sufficient as an interface for front-ends to Marpa like MDL.

  • Factor out MDL so that it works by calling (or creating options for a call to) Grammar::new or Grammar::set.

  • Default to SELF_ARG on. Add code to abend if SELF_ARG is off. Test. Fix what breaks.

  • Test: Can regexes be stored with Storable? -- NO

  • Allow cloning of unstripped recognizers.

  • Convert cloning to Storable.

  • Remove no longer needed "strip => 0" options.

  • Convert all stringization to Storable.

  • After MDL is factored out, and stripping & cloning are fully functional, create a matrix of tests of valid grammars vs. their results. Matrix is

        grammars X inputs X clone grammar X clone recognizer X strip grammar X strip recognizer

    Do just one or two inputs per grammar.

  • MDLex currently does not sort terminals by priority. This means terminal priorities are totally unused. Change this?
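The test matrix described in the task list above could be enumerated along the following lines. This is only a sketch: the flag names, the `flag_combinations` and `test_matrix` helpers, and the placeholder grammar and input names are all hypothetical, and the code does not call Marpa at all. It only shows how the cross product stays manageable with one or two inputs per grammar.

```perl
use strict;
use warnings;

# Hypothetical axis names; a real harness would map each flag to an
# actual clone or strip step on the grammar or recognizer.
my @flags = qw(clone_grammar clone_recognizer strip_grammar strip_recognizer);

# Build every on/off combination of the named flags (2**4 = 16 rows here).
sub flag_combinations {
    my @names = @_;
    my @rows;
    for my $mask ( 0 .. ( 2**scalar(@names) ) - 1 ) {
        my %row = map { ( $names[$_] => ( $mask >> $_ ) & 1 ) } 0 .. $#names;
        push @rows, \%row;
    }
    return @rows;
}

# Cross grammars with their inputs and with every flag combination.
sub test_matrix {
    my ( $grammars, $inputs_for ) = @_;
    my @cases;
    for my $grammar ( @{$grammars} ) {
        for my $input ( @{ $inputs_for->{$grammar} } ) {
            for my $combo ( flag_combinations(@flags) ) {
                push @cases,
                    { grammar => $grammar, input => $input, %{$combo} };
            }
        }
    }
    return @cases;
}

# Placeholder grammar and input names, for illustration only.
my @cases = test_matrix(
    [ 'g1', 'g2' ],
    { g1 => ['a'], g2 => [ 'b', 'c' ] },
);
printf "%d test cases\n", scalar @cases;
```

With three grammar/input pairs and four boolean flags, this yields 48 cases, each of which a test script could run and compare against the expected result for that grammar and input.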

HIGH PRIORITY

  • Implement a recognizer option to simply skip ahead over spans where no terminal can possibly be found. This means coming up with some way of passing Marpa's idea of where the next terminal should be expected to the lexer. Also, the lexer should be able to inform Marpa which earleme a token is at.

    Use PREDICTIVE flag to the Recognizer to enable this?

  • Allow action options to be changed when evaluator is created, but not afterwards.

  • Create a "stress test" for zero-width and- and or- nodes: A grammar with two rules in a cycle. Each rule is all nulling (nullable?) symbols except for the last symbol, which must be non-nulling.

    I may assume in several places that zero-length and-nodes have no cause or predecessor. I have to find any places where this assumption is being made, and change them. In addition to the above test, I may want to reread the relevant evaluator code.

  • Code to ensure zero-width or-nodes have unique parents seems to assume that the child and-nodes have no cause or predecessor. I can't assume this. I need to create a work list of or-nodes, and re-add those when cloned and-nodes are added to their parent list.

    I can assume no cycles. Reason: Marpa does not allow zero-length rules, and cycles in the bocage only occur when rules derive rules. Breaking up rules into and-nodes with at most two children will not create cycles. Rules do not cycle internally. A sequence of and- and or- nodes will descend via predecessors until it hits the beginning of the rule, and then stop.

  • I should start to assume that anyone explicitly using separate recognizer and evaluator stages knows what they are doing. This means:

    • Revise the documentation accordingly. Just because they are using the plumbing interface does not mean they want to know about the recognizer/evaluator distinction.

    • No auto precomputation in the Grammar module. Just abend with an error message.

    • No auto cloning? Assume most users will have cloning handled as necessary at a higher level.

    • No automatic end_input when the evaluator is called. Just abend with an error message.

  • Make minimal work for null symbols.

  • Forbid simultaneous MAXIMAL & MINIMAL settings on rules & symbols. Combine them internally as ::GREEDY.

  • Add ah_minimal.t -- ah2.t with minimal set on the grammar

  • Add optimizations when max_parses <= 1. For example, there is no need to prune duplicate parses.

  • Or_maps should be hashes instead of arrays.

  • Add MAX_COUNT for rules. Implement on one of left- and right- recursion only. No immediate need for both.
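The item above about forbidding simultaneous MAXIMAL and MINIMAL settings could be enforced with a check like the one sketched here. Everything in it is an assumption made for illustration: the `greedy_setting` helper, the `maximal` and `minimal` option names, and the three-valued encoding (+1, -1, 0) are hypothetical and are not Marpa's actual representation of ::GREEDY.

```perl
use strict;
use warnings;
use Carp qw(croak);

# Hypothetical encoding of a combined setting:
#   +1 = maximal, -1 = minimal, 0 = neither.
sub greedy_setting {
    my (%opt) = @_;
    my $maximal = $opt{maximal} ? 1 : 0;
    my $minimal = $opt{minimal} ? 1 : 0;

    # Reject the contradictory combination up front.
    croak 'maximal and minimal cannot both be set'
        if $maximal && $minimal;

    return $maximal - $minimal;
}

# Usage: a valid setting returns a value; the conflict is caught with eval.
my $greedy = greedy_setting( minimal => 1 );
my $ok = eval { greedy_setting( maximal => 1, minimal => 1 ); 1 };
print "conflict rejected\n" if not $ok;
```

Folding the two flags into one value means later code only has to test a single field, and the contradictory state cannot be represented after the check.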

MEDIUM PRIORITY

  • In the MDL grammar, concatenate_lines is called uselessly in a number of places. Eliminate this, for efficiency's sake?

  • bin/mdl in parsing equation.marpa (which is used for many examples in the author.t/Makefile) takes a final return and parses past it, but does not fail.

    The return is NOT in the grammar. What happens is that the call to ::text() is not recognized as a failure, and then the evaluator looks at the last successful parse.

    What to do here? It should probably be failing. And does the current behavior of the evaluator (taking the default end as FURTHEST_EARLEME) set the user up for spurious successes?

  • Add a SYNOPSIS to the Plumbing document.

  • When rereading the Internals doc, check whether parse bocage creation can be made slightly cleaner.

  • Make sure that nulling symbols can never be terminals.

LOW PRIORITY

  • Add a trace_choices option? There was an option to trace non-trivial evaluation choices in the old evaluator, and the new trace_iterations doesn't entirely replace it.

  • Add a show_derivation option.

MAYBE, MAYBE NOT

  • Test lexing suffixes? Remove them?

  • Speed-up for pre-computing lexables? Predict lexables based on user request?

  • show_tree before first call to value? Should it cause an error message? How about after unsuccessful call to value?

OTHER FROZEN

Downgrade MDL version conflict or semantics mismatch to warning?

Probably not. MDL is EOL'ed.

Lifting Restrictions on Sequence Productions

The restriction of sequences to sequence productions and of sequence productions to a single sequence is not the result of any limit of the Marpa parse engine. It would not be hard to allow any number of sequences and optional sequences on the right hand side of any BNF production. I'm open to revisiting this issue and lifting the restriction.

The problem is figuring out how to conveniently specify their semantics. As the right hand side of a production grows more complex, the semantics becomes more complex to write, more bug-prone, and harder to debug.

SUPPORT

See the support section in the main module.

AUTHOR

Jeffrey Kegler

LICENSE AND COPYRIGHT

Copyright 2007 - 2008 Jeffrey Kegler

This program is free software; you can redistribute it and/or modify it under the same terms as Perl 5.10.0.