The Perl Toolchain Summit needs more sponsors. If your company depends on Perl, please support this very important event.

NAME

MarpaX::ESLIF::Recognizer::Interface - MarpaX::ESLIF's recognizer interface

VERSION

version 6.0.33.4

DESCRIPTION

Recognizer interface is a list of methods that are required by MarpaX::ESLIF at run-time when it needs more data. It has to be an object instance, referenced with $recognizerInterface below.

METHODS

$recognizerInterface->read()

Performs read of user data and returns a true value on success, a false value otherwise. $recognizerInterface is responsible to maintain the status in terms of: data content, data type (binary or character), eof flag, eventual encoding information, that are queried using the following methods:

$recognizerInterface->isEof()

Returns a boolean value indicating the end of the stream.

$recognizerInterface->isCharacterStream()

Returns a boolean value indicating if current chunk is a character stream or not.

$recognizerInterface->encoding()

Encoding of latest chunk of data, when the later is a character chunk. It is legal to return undef.

If current chunk of data is a character stream, and this method returns undef, then marpaESLIF will either:

guess the encoding if there previous chunk of data was not a character stream,
continue with previous encoding if previous chunk of data was a character stream
$recognizerInterface->data()

Returns data content of current chunk, may be of zero size.

$recognizerInterface->isWithDisableThreshold()

Returns a boolean indicating if threshold warnings should be fired. This is called once at the very beginning of a recognizer lifetime.

$recognizerInterface->isWithExhaustion()

Returns a boolean indicating if exhaustion should trigger an exhaustion event. This is called once at the very beginning of a recognizer lifetime.

When the parse is exhausted, the normal behavior is to exit with an error if the eof flag is not set. This method is saying that an exhaustion event should be raised instead, and is used at recognizer creation step only.

$recognizerInterface->isWithNewline()

Returns a boolean indicating if line/number accounting is on. This is called once at the very beginning of a recognizer lifetime.

Error reporting can be accurate up to line and column numbers when this is happening on a character stream enabled chunk of data. This is handy, but has an extra cost on parsing performance. This method is used at recognizer creation step only.

$recognizerInterface->isWithTrack()

Returns a boolean indicating if absolute position tracking is on. This is called once at the very beginning of a recognizer lifetime.

Absolute position tracking must be on if you plan to use one of the lastCompletedOffset(), lastCompletedLength() or lastCompletedLocation() recognizer methods. The information returned by these methods is not fully reliable, because ESLIF will not check if there is a turnaround with associated internal variables.

$recognizerInterface->setRecognizer($recognizer)

This method is not required, and only necessary if you want to use the current recognizer instance in a callback. Recognizer callbacks all run in the interface namespace, and the current recognizer might be unknown to you if you used the MarpaX::ESLIF::Grammar parse method, or ambiguous if you are sharing recognizers, using e.g. MarpaX::ESLIF::Recognizer newFrom method.

If the method setRecognizer exists in this interface, ESLIF will call it with a special MarpaX::ESLIF::Recognizer instance, that is shallowing the current recognizer.

It is then advisable to do a getRecognizer method to retreive the value, e:g:

  sub setRecognizer { my ($self, $recognizer) = @_; $self->{recognizer} = $recognizer; }
  sub getRecognizer { my ($self) = shift; $self->{recognizer} }
$recognizerInterface->resolver($action)

This method is not required, and only necessary if you want to use callbacks without polluting a package stash.

Recognizer callbacks to perl are are resolved this way:

  • If the resolved method exist, it is used to get the callback for a given action $action.

  • If the return value of the resolver is a code reference, it is used. Else it will be assumed that the callback is available in package's stash.

NOTES

The isWithDisableThreshold(), isWithExhaustion(), isWithNewline() and isWithTrack() methods are always called once at the very beginning of a recognizer processing.

The read(), encoding(), isEof() and isCharacterStream() methods are always called whenever the parser needs more data, and in this order.

AUTHOR

Jean-Damien Durand <jeandamiendurand@free.fr>

COPYRIGHT AND LICENSE

This software is copyright (c) 2017 by Jean-Damien Durand.

This is free software; you can redistribute it and/or modify it under the same terms as the Perl 5 programming language system itself.