The Perl Toolchain Summit needs more sponsors. If your company depends on Perl, please support this very important event.

NAME

Set::FA::Element - Discrete Finite Automaton

Synopsis

This is scripts/synopsis.2.pl (a test-free version of t/report.t):

        #!/usr/bin/env perl

        use strict;
        use warnings;

        use Set::FA::Element;

        # --------------------------

        my($dfa) = Set::FA::Element -> new
        (
                accepting       => ['baz'],
                maxlevel        => 'debug',
                start           => 'foo',
                transitions     =>
                [
                        ['foo', 'b', 'bar'],
                        ['foo', '.', 'foo'],
                        ['bar', 'a', 'foo'],
                        ['bar', 'b', 'bar'],
                        ['bar', 'c', 'baz'],
                        ['baz', '.', 'baz'],
                ],
        );

        print "Got: \n";
        $dfa -> report;

        print "Expected: \n", <<EOS;
        Entered report()
        State Transition Table
        State: bar
        Rule => Next state
        /a/ => foo
        /b/ => bar
        /c/ => baz
        State: baz. This is an accepting state
        Rule => Next state
        /./ => baz
        State: foo. This is the start state
        Rule => Next state
        /b/ => bar
        /./ => foo
        EOS

        Or make use of:

        my($boolean)  = $dfa -> accept($input);
        my($current)  = $dfa -> advance($input);
        my($state)    = $dfa -> current;
        my($boolean)  = $dfa -> final;
        my($acceptor) = $dfa -> final($state);
        my($string)   = $dfa -> match;
        my($current)  = $dfa -> reset;
        my($current)  = $dfa -> state;
        my($boolean)  = $dfa -> state($state);
        my($string)   = $dfa -> step($input);
        my($boolean)  = $dfa -> step_state($next);

Description

Set::FA::Element provides a mechanism to define and run a DFA.

Installation

Install Set::FA as you would for any Perl module:

Run:

        cpanm Set::FA

or run:

        sudo cpan Set::FA

or unpack the distro, and then either:

        perl Build.PL
        ./Build
        ./Build test
        sudo ./Build install

or:

        perl Makefile.PL
        make (or dmake or nmake)
        make test
        make install

Constructor and Initialization

Parentage

You can easily subclass Set::FA::Element by having your subclass use exactly the same logic as in the code, after declaring your Moo-based getters and setters.

Using new()

new() is called as my($dfa) = Set::FA::Element -> new(k1 => v1, k2 => v2, ...).

It returns a new object of type Set::FA::Element.

Key-value pairs accepted in the parameter list are as follows. Also, each is also a method, so you can retrieve the value and update it at any time.

Naturally, after the internal state transition table has been constructed (during the call to new()), updates to some of these fields will be ignored. Methods which are effective later are documented.

o accepting => []

Provides an arrayref of accepting state names.

This key is optional.

The default is [].

o actions => {}

Provides a hashref of entry/exit functions keyed by state name.

This means you can have only 1 entry function and 1 exit function per state.

For a module which gives you the power to have a different entry and exit function for each different regexp which matches the input, see the (as yet unwritten) Set::FA::Manifold.

For a given state name key, the value is a hashref with 1 or 2 of these keys:

o entry => \&function or => [\&function, 'function_name']

The 'entry' key points to a reference to a function to be called upon entry to a state.

Alternately, you can pass in an arrayref, with the function reference as the first element, and a string, e.g. the function name, as the second element.

The point of the [\&fn, 'fn'] version is when you call report(), and the 'fn' string is output.

o exit => \&function or => [\&function, 'function_name']

The 'exit' key points to a reference to a function to be called upon exit from a state.

Alternately, you can pass in an arrayref, with the function reference as the first element, and a string, e.g. the function name, as the second element.

The point of the [\&fn, 'fn'] version is when you call report(), and the 'fn' string is output.

Each of these functions is called (in method step_state() ) with the DFA object as the only parameter. You use that object to call the methods listed in these docs. See "Synopsis" for a list.

This key is optional.

The default is {}.

o die_on_loop => $boolean

Provides a way for the code to keep running, or die, when the advance() method determines that input is not being consumed.

Setting die_on_loop to 0 means keep running.

Setting die_on_loop to 1 means the code dies, after outputting an error message.

Retrieve and update the value with the die_on_loop() method.

This key is optional.

The default is 0.

o id => $string

Provides a place to store some sort of identifier per DFA object.

Retrieve and update the value with the id() method.

This key is optional.

The default is ''.

o logger => $aLoggerObject

Specify a logger compatible with Log::Handler, for the lexer and parser to use.

Default: A logger of type Log::Handler which writes to the screen.

To disable logging, just set 'logger' to the empty string (not undef).

o maxlevel => $logOption1

This option affects Log::Handler.

See the Log::Handler::Levels docs.

By default nothing is printed.

Typical values are: 'error', 'notice', 'info' and 'debug'.

The default produces no output.

Default: 'notice'.

o minlevel => $logOption2

This option affects Log::Handler.

See the Log::Handler::Levels docs.

Default: 'error'.

No lower levels are used.

o start => $name_of_start_state

Provides the name of the start state.

Retrieve and update the value with the start() method.

This key is mandatory.

There is no default.

o transitions => []

Provides a complex arrayref of state names and regexps which control the transitions between states.

Each element of this arrayref is itself an arrayref of 3 elements:

o [0] ($state)

The name of the state, which has to match the 'current' state, before other elements of this arrayref are utilized.

o [1] ($regexp)

The regexp, as a string, against which the input is tested, to determine whether or not to move to the next state.

This string may be a coderef. As such, it should contain 2 pairs of parentheses. The first must capture the matched portion of the input, and the second must capture the unmatched portion of the input.

If it is not a coderef, it is wrapped in qr/($regexp)/ and turned into a coderef which returns the 2 portions of the input as described in the previous sentence.

The code supplies the extra parentheses in the qr// above so that the matched portion of the input can be retrieved with the match() method.

If the regexp does not match, (undef, undef) must be returned by the coderef.

o [2] ($next)

The name of the state to which the DFA will move when the regexp matches the input.

The string which matched, if any, can be retrieved with the match() method.

The name of the new state can be retrieved with the current() method.

This key is mandatory.

There is no default.

Methods

accept($input)

Calls "advance($input)".

Returns 1 if the 'current' state - after processing the input - is an accepting state.

Returns 0 otherwise.

accepting($arrayref_of_states)

See "Using new()" for details.

accepting is a parameter to "new([%args])".

actions($arrayref_of_states)

See "Using new()" for details.

actions is a parameter to "new([%args])".

advance($input)

Calls "step($input)" repeatedly on the unconsumed portion of the input.

Returns the 'current' state at the end of that process.

Since "step($input)" calls "match($consumed_input)" upon every match, and "step_state($next)" too, you necessarily lose access to the individual portions of the input matched by successive runs of the coderef per transition table entry.

At the end of this process, then, "match($consumed_input)" can only return the last portion matched.

See "step($input)" for advancing the DFA by a single transition.

Logging note:

o When die_on_loop is 0

Then, advance() calls $your_logger -> log(warning => $message) when input is not consumed.

o When die_on_loop is 1

Calls die($message).

build_stt()

Use these parameters to new() to construct a state transition table:

o accepting
o actions
o start
o transitions

Note: The private method _init() calls validate_params() before calling build_stt(), so if you call accepting($new_accepting), actions($new_actions), start($new_start) and transtions($new_transitions), for some reason, and then call build_stt(), you will miss out on the benefit of calling validate_params(). So don't do that!

current([$state])

Here, the [] indicate an optional parameter.

o When $state is not provided

Returns the 'current' state of the DFA.

o When $state is provided

Sets the 'current' state of the DFA.

die_on_loop([$Boolean])

See "Using new()" for details. See also "advance($input)" for a discussion of die_on_loop.

die_on_loop is a parameter to "new([%args])".

final([$state])

Here, the [] indicate an optional parameter.

o When $state is not provided

Returns 1 if the 'current' state is an accepting state.

Returns 0 otherwise.

o When $state is provided

Returns 1 if $state is an accepting state.

Returns 0 otherwise.

id([$id])

Here, the [] indicate an optional parameter.

See "Using new()" for details.

id is a parameter to "new([%args])".

log($level, $message)

Calls log($level, $message) on the logger object if that object is defined.

To stop this, set the logger to '' in the call to "new([%args])".

logger($arrayref_of_states)

See "Using new()" for details.

logger is a parameter to "new([%args])".

maxlevel([$string])

Here, the [] indicate an optional parameter.

See "Using new()" for details.

Get or set the value used by the logger object.

This option is only used if an object of type Log::Handler is ceated, and such an object is created by default. To stop this, set the logger to '' in the call to "new([%args])".

See Log::Handler::Levels.

Typical values are: 'notice', 'info' and 'debug'. The default, 'notice', produces no output.

maxlevel is a parameter to "new([%args])".

minlevel([$string])

Here, the [] indicate an optional parameter.

See "Using new()" for details.

Get or set the value used by the logger object.

This option is only used if an object of type Log::Handler is ceated, and such an object is created by default. To stop this, set the logger to '' in the call to "new([%args])".

See Log::Handler::Levels.

minlevel is a parameter to "new([%args])".

new([%args])

The constructor. See "Constructor and Initialization".

match([$consumed_input])

Here, the [] indicate an optional parameter.

o When $consumed_input is not provided

Returns the portion of the input matched by the most recent step of the DFA.

o When $consumed_input is provided

Sets the internal string which will be returned by calling match().

report()

Log the state transition table, at log level 'info'.

reset()

Resets the DFA object to the start state.

Returns the 'current' state, which will be the start state.

Does not reset the id or anything else associated with the object.

start([$start])

Here, the [] indicate an optional parameter.

See "Using new()" for details.

start is a parameter to "new([%args])".

state([$state])

Here, the [] indicate an optional parameter.

o When $state is not provided

Returns the 'current' state.

o When $state is provided

Returns 1 if $state was defined in the transitions arrayref supplied to new().

Returns 0 otherwise.

step($input)

Advances the DFA by a single transition, if possible.

The code checks each entry in the transitions arrayref supplied to new(), in order, looking for entries whose 1st element ($state) matches the 'current' state.

Upon the first match found (if any), the code runs the coderef in the 2nd element ($rule_sub) of that entry.

If there is a match:

o Calls "match($consumed_input)" so you can retrieve that value with the match() method
o Calls "step_state($next)", passing in the 3rd element ($next) in that entry as the only parameter

Returns the unconsumed portion of the input.

step_state($next)

Performs these steps:

o Compares the 'current' state to $next

If they match, returns 0 immediately.

o Calls the exit function, if any, of the 'current' state
o Set the 'current' state to $next
o Calls the entry function, if any, of the new 'current' state
o Returns 1.

transitions($arrayref_of_states)

See "Using new()" for details.

transitions is a parameter to "new([%args])".

validate()

Perform validation checks on these parameters to new():

o accepting
o start
o transitions

FAQ

How do I protect the code from dying?

Use Capture::Tiny. See t/report.t for a simple example.

What's changed in V 2.00 of Set::FA::Element?

o Put the distro on github

See "Repository" below.

o Switch to Moo
o This means method chaining is no longer supported
o Explicitly default the logger to an instance of Log::Handler
o Add maxlevel() and minlevel() for controlling the logger
o Rewrite log so messages do not have the prefix of "$level: "
o Make the synopses in both modules into stand-alone scripts

See scripts/synopsis.*.pl.

o Add 'use strict' and 'use warnings' to t/*.t
o Move t/pod.t to xt/author/
o Switch from Test::More to Test::Stream
o Remove verbose()

What's changed in V 1.00 of Set::FA::Element?

Note: I have switched to Moo and Log::Handler.

o Use Moo for getters and setters

Originally, Set::FA::Element used direct hash access to implement the logic. I did not want to do that. At the same time, I did not want users to incur the overhead of Moose.

So, I've adopted my standard policy of using Moo.

o Rename the new() parameter from accept to accepting

While direct hash access allowed the author of Set::FA::Element to have a hash key and a method with the same name, accept, I can't do that now.

So, the parameter to new() (in Set::FA::Element) is called accepting, and the method is still called accept().

o Add a parameter to new(), die_on_loop

This makes it easy to stop a run-away program during development.

o Add a parameter to new(), logger

See below for details.

o Add a parameter to new(), start

This must be used to name the start state.

o Chop the states parameter to new()

The state names are taken from the transitions parameter to new().

o Add a parameter to new(), verbose

This makes it easy to change the quantity of progress reports.

o Add a method, build_stt() to convert new()'s parameters into a state transition table
o Add a method, current() to set/get the current state
o Add a method, die_on_loop() to set/get the like-named option
o Add a method, id() to set/get the id per object
o Add a method, log() to call the logger object
o Add a method, logger() to set/get the logger object
o Add a method, match(), to retrieve exactly what matched at each transition
o Add a method, report(), to print the state transition table
o Add a method, start() to set/get the start state per object
o Add a method, validate() to validate new()'s parameters
o Add a method, verbose() to set/get the like-named option

Why such a different approach to logging?

Firstly, Set::FA::Element used Log::Agent. I always use Log::Handler these days.

By default (i.e. without a logger object), Set::FA::Element prints messages to STDOUT, and dies upon errors.

However, by supplying a log object, you can capture these events.

Not only that, you can change the behaviour of your log object at any time, by calling "logger($logger_object)".

Specifically, $logger_object -> log(debug => 'Entered x()') is called at the start of each public method.

Setting your log level to 'debug' will cause these messages to appear.

Setting your log level to anything below 'debug', e.g. 'info, 'notice', 'warning' or 'error', will suppress them.

Also, "step_state($next)" calls:

        $self -> log(info => "Stepped from state '$current' to '$next' using rule /$rule/ to match
        '$match'");

just before it returns.

Setting your log level to anything below 'info', e.g. 'notice', 'warning' or 'error', will suppress this message.

Hence, by setting the log level to 'info', you can log just 1 line per state transition, and no other messages, to minimize output.

Lastly, although this logging mechanism may seem complex, it has distinct advantages:

o A design fault in Log::Agent (used by the previous author):

This method uses a global variable to control the level of logging. This means the code using Set::FA::Element can (also) use Log::Agent and call logconfig(...), which in turn affects the behaviour of the logging calls inside those other modules. I assume this design is deliberate, but I certainly don't like it, because (I suspect) it means any running Perl program which changes the configuration affects all other running programs using Log::Agent.

o Log levels

You can configure your logger object, either before calling new(), or at any later time, by changing "minlevel([$string])" or "maxlevel([$string])".

That allows you complete control over the logging activity.

The only log levels output by this code are (from high to low): debug, info, warning and error.

Error messages are self-documenting, in that when you trigger them, you get to read them...

Machine-Readable Change Log

The file Changes was converted into Changelog.ini by Module::Metadata::Changes.

Version Numbers

Version numbers < 1.00 represent development versions. From 1.00 up, they are production versions.

Credit

See "Credit" in Set::FA.

See Also

See "See Also" in Set::FA.

Repository

https://github.com/ronsavage/Set-FA

Support

Email the author, or log a bug on RT:

https://rt.cpan.org/Public/Dist/Display.html?Name=Set::FA

Author

Set::FA::Element was written by Mark Rogaski and Ron Savage <ron@savage.net.au> in 2011.

My homepage: http://savage.net.au/index.html

Copyright

Australian copyright (c) 2011, Ron Savage.

        All Programs of mine are 'OSI Certified Open Source Software';
        you can redistribute them and/or modify them under the terms of
        The Perl License, a copy of which is available at:
        http://www.opensource.org/licenses/index.html