The Perl Toolchain Summit needs more sponsors. If your company depends on Perl, please support this very important event.

NAME

Set::FA::Element - Discrete Finite Automaton

Synopsis

        #!/usr/bin/perl
        
        use strict;
        use warnings;
        
        use Set::FA::Element;
        
        # --------------------------
        
        my($dfa) = Set::FA::Element -> new
        (
         accepting   => ['baz'],
         start       => 'foo',
         transitions =>
         [
          ['foo', 'b', 'bar'],
          ['foo', '.', 'foo'],
          ['bar', 'a', 'foo'],
          ['bar', 'b', 'bar'],
          ['bar', 'c', 'baz'],
          ['baz', '.', 'baz'],
         ],
        );
        
        my($boolean)  = $dfa -> accept($input);
        my($current)  = $dfa -> advance($input);
        my($state)    = $dfa -> current;
        my($boolean)  = $dfa -> final;
        my($acceptor) = $dfa -> final($state);
        my($string)   = $dfa -> match;
        my($current)  = $dfa -> reset;
        my($current)  = $dfa -> state;
        my($boolean)  = $dfa -> state($state);
        my($string)   = $dfa -> step($input);
        my($boolean)  = $dfa -> step_state($next);

Description

Set::FA::Element provides a mechanism to define and run a DFA.

Installation

Install Set::FA as you would for any Perl module:

Run:

        cpanm Set::FA

or run:

        sudo cpan Set::FA

or unpack the distro, and then either:

        perl Build.PL
        ./Build
        ./Build test
        sudo ./Build install

or:

        perl Makefile.PL
        make (or dmake or nmake)
        make test
        make install

Constructor and Initialization

Parentage

You can easily subclass Set::FA::Element by having your subclass use exactly the same logic as in the code, - see new(), and _init() - after declaring your getters and setters using the Hash::FieldHash syntax.

Using new()

new() is called as my($dfa) = Set::FA::Element -> new(k1 => v1, k2 => v2, ...).

It returns a new object of type Set::FA::Element.

Key-value pairs accepted in the parameter list are as follows. Also, each is also a method, so you can retrieve the value and update it at any time.

Naturally, after the internal state transition table has been constructed (during the call to new() ), updating some of these fields will be ignored. Methods which are effective later are documented.

o accepting => []

Provides an arrayref of accepting state names.

This key is optional.

The default is [].

o actions => {}

Provides a hashref of entry/exit functions keyed by state name.

This means you can have only 1 entry function and 1 exit function per state.

For a module which gives you the power to have a different entry and exit function for each different regexp which matches the input, see the (as yet unwritten) Set::FA::Manifold.

Format:

o entry => \&function or => [\&function, 'function_name']

The 'entry' key points to a reference to a function to be called upon entry to a state.

Alternately, you can pass in an arrayref, with the function reference as the first element, and a string, e.g. the function name, as the second element.

The point of the [\&fn, 'fn'] version is when you call report(), and the 'fn' string is output.

o exit => \&function or => [\&function, 'function_name']

The 'exit' key points to a reference to a function to be called upon exit from a state.

Alternately, you can pass in an arrayref, with the function reference as the first element, and a string, e.g. the function name, as the second element.

The point of the [\&fn, 'fn'] version is when you call report(), and the 'fn' string is output.

Each of these functions is called (in method step_state() ) with the DFA object as the only parameter.

This key is optional.

The default is {}.

o data => $string

A place to store anything you want, per DFA.

Retrieve and update the value with the data() method.

This key is optional.

The default is ''.

o die_on_loop => $boolean

Provides a way for the code to keep running, or die, when the advance() method determines that input is not being consumed.

Setting die_on_loop to 0 means keep running.

Setting die_on_loop to 1 means the code dies, after outputting an error message.

Retrieve and update the value with the die_on_loop() method.

This key is optional.

The default is 0.

o id => $string

Provides a place to store some sort of identifier per DFA object.

Retrieve and update the value with the id() method.

This key is optional.

The default is ''.

o logger => $logger_object

Provides a logger object whose $level($message) method is called at certain times.

See "Why such a different approach to logging?" in the "FAQ" for details.

Retrieve and update the value with the logger() method.

This key is optional.

The default is ''.

See also the verbose option, which can interact with the logger option.

o start => $name_of_start_state

Provides the name of the start state.

Retrieve and update the value with the start() method.

This key is mandatory.

There is no default.

o transitions => []

Provides a complex arrayref of state names and regexps which control the transitions between states.

Each element of this arrayref is itself an arrayref of 3 elements:

o [0] ($state)

The name of the state, which has to match the 'current' state, before other elements of this arrayref are utilized.

o [1] ($regexp)

The regexp, as a string, against which the input is tested, to determine whether or not to move to the next state.

This string may be a coderef. As such, it should contain 2 pairs of parentheses. The first must capture the matched portion of the input, and the second must capture the unmatched portion of the input.

If it is not a coderef, it is wrapped in qr/($regexp)/ and turned into a coderef which returns the 2 portions of the input as described in the previous sentence.

The code supplies the extra parentheses in the qr// above so that the matched portion of the input can be retrieved with the match() method.

If the regexp does not match, (undef, undef) must be returned by the coderef.

o [2] ($next)

The name of the state to which the DFA will move when the regexp matches the input.

The string which matched, if any, can be retrieved with the match() method.

The name of the new state can be retrieved with the current() method.

This key is mandatory.

There is no default.

o verbose => $boolean

Provides a way to control the amount of output when a logger is not specified.

Setting verbose to 0 means print nothing.

Setting verbose to 1 means print the log level and the message to STDOUT, when a logger is not specified.

This key is optional.

Retrieve and update the value with the verbose() method.

The default is 0.

See also the logger option, which can interact with the verbose option.

Methods

Note: Methods generated by Hash::FieldHash are designed to operate like this:

o When called without a parameter...

They return the value they are managing. Hence:

        my($current_state) = $dfa -> current.
o When called with a parameter...

They return the object, to allow method chaining. Hence:

        $dfa -> current($new_state);
        my($current_state) = $dfa -> current;

Don't do this:

        my($current_state_no_no) = $dfa -> current($new_state);

You could do this:

        my($current_state) = $dfa -> current($new_state) -> current;

All such methods below warn of this.

accept($input)

Calls "advance($input)".

Returns 1 if the 'current' state - after processing the input - is an accepting state.

Returns 0 otherwise.

advance($input)

Calls "step($input)" repeatedly on the unconsumed portion of the input.

Returns the 'current' state at the end of that process.

Since "step($input)" calls "match($consumed_input)" upon every match, and "step_state($next)" too, you necessarily lose access to the individual portions of the input matched by successive runs of the coderef per transition table entry.

At the end of this process, then, "match($consumed_input)" can only return the last portion matched.

See "step($input)" for advancing the DFA by a single transition.

Logging note:

o When die_on_loop is 0

Then, advance() calls $your_logger -> warning($message) when input is not consumed.

If there is no logger, calls print "warning: $message\n". But, when verbose is 0, prints nothing.

o When die_on_loop is 1

Calls die($message).

build_stt()

Use these parameters to new() to construct a state transition table:

o accepting
o actions
o start
o transitions

Note: The private method _init() calls validate_params() before calling build_stt(), so if you call accepting($new_accepting), actions($new_actions), start($new_start) and transtions($new_transitions), for some reason, and then call build_stt(), you will miss out on the benefit of calling validate_params(). So don't do that!

clone()

Returns a deep copy of the Set::FA::Element object.

current([$state])

Here, the [] indicate an optional parameter.

o When $state is not provided

Returns the 'current' state of the DFA.

o When $state is provided

Sets the 'current' state of the DFA.

Returns the object, for method chaining.

data([$string])

Here, the [] indicate an optional parameter.

o When $string is not provided

Returns the data associated with the object.

o When $data is provided

Sets the data associated with the object.

Returns the object, for method chaining.

final([$state])

Here, the [] indicate an optional parameter.

o When $state is not provided

Returns 1 if the 'current' state is an accepting state.

Returns 0 otherwise.

o When $state is provided

Returns 1 if $state is an accepting state.

Returns 0 otherwise.

id([$id])

Here, the [] indicate an optional parameter.

o When $id is not provided

Returns the id of the object.

o When $id is provided

Sets the id of the object.

Returns the object, for method chaining.

log([$level, $message])

Here, the [] indicate an optional parameters.

If you call it as $dfa -> log(), $level defaults to 'debug' and $message defaults to ''.

Firstly, the error level is checked:

        if ($level eq 'error')
        {
            die $message;
        }

If not an error, log() then executes this line:

        if ($self -> logger)                     # If there is a logger...
        {
            $self -> logger -> $level($message); # Call it.
        }
        elsif ($self -> verbose)                 # Otherwise (no logger) and we're in verbose mode...
        {
            print "$level: $message\n";          # Print.
        }
                                                 # Otherwise (silent) do nothing.

Returns the object, for method chaining.

logger([$logger_object])

Here, the [] indicate an optional parameter.

o When $logger_object is not provided

Sets the internal logger object to ''.

o When $logger_object is provided

Sets the internal logger object to $logger_object.

This allows you to change the log levels accepted and suppressed by the logger object, during the run of the DFA.

You are strongly advised to read "Why such a different approach to logging?", as well as the notes on the logging and verbose options to new().

Returns the internal logger object, or ''.

match([$consumed_input])

Here, the [] indicate an optional parameter.

o When $consumed_input is not provided

Returns the portion of the input matched by the most recent step of the DFA.

o When $consumed_input is provided

Sets the internal string which will be returned by calling match().

Returns the object, for method chaining.

report()

Log the state transition table, at log level 'info'.

reset()

Resets the DFA object to the start state.

Returns the 'current' state, which will be the start state.

Does not reset the id or anything else associated with the object.

start([$start])

Here, the [] indicate an optional parameter.

o When $start is not provided

Returns the start state of the object.

o When $start is provided

Sets the start state of the object.

Returns the object, for method chaining.

state([$state])

Here, the [] indicate an optional parameter.

o When $state is not provided

Returns the 'current' state.

o When $state is provided

Returns 1 if $state was defined in the transitions arrayref supplied to new().

Returns 0 otherwise.

step($input)

Advances the DFA by a single transition, if possible.

The code checks each entry in the transitions arrayref supplied to new(), in order, looking for entries whose 1st element ($state) matches the 'current' state.

Upon the first match found (if any), the code runs the coderef in the 2nd element ($rule_sub) of that entry.

If there is a match:

o Calls "match($consumed_input)" so you can retrieve that value with the match() method
o Calls "step_state($next)", passing in the 3rd element ($next) in that entry as the only parameter

Returns the unconsumed portion of the input.

step_state($next)

Performs these steps:

o Compares the 'current' state to $next

If they match, returns 0 immediately.

o Calls the exit function, if any, of the 'current' state
o Set the 'current' state to $next
o Calls the entry function, if any, of the new 'current' state
o Returns 1.

validate()

Perform validation checks on these parameters to new():

o accepting
o start
o transitions

FAQ

What's changed in V 1.00 of Set::FA::Element?

o Use Hash::FieldHash for getters and setters

Originally, Set::FA::Element used direct hash access to implement the logic. I did not want to do that. At the same time, I did not want users to incur the overhead of Moose.

So, I've adopted my standard policy of using Hash::FieldHash in stand-alone modules and Moose in apps.

o Rename the new() parameter from accept to accepting

While direct hash access allowed the author of Set::FA::Element to have a hash key and a method with the same name, accept, I can't do that now.

So, the parameter to new() (in Set::FA::Element) is called accepting, and the method is still called accept().

o Add a parameter to new(), die_on_loop

This makes it easy to stop a run-away program during development.

o Add a parameter to new(), logger

See below for details.

o Add a parameter to new(), start

This must be used to name the start state.

o Chop the states parameter to new()

The state names are taken from the transitions parameter to new().

o Add a parameter to new(), verbose

This makes it easy to change the quantity of progress reports.

o Add a method, build_stt() to convert new()'s parameters into a state transition table
o Add a method, current() to set/get the current state
o Add a method, data() to set/get the arbitrary data per object
o Add a method, die_on_loop() to set/get the like-named option
o Add a method, id() to set/get the id per object
o Add a method, log() to call the logger object
o Add a method, logger() to set/get the logger object
o Add a method, match(), to retrieve exactly what matched at each transition
o Add a method, report(), to print the state transition table
o Add a method, start() to set/get the start state per object
o Add a method, validate() to validate new()'s parameters
o Add a method, verbose() to set/get the like-named option

Why such a different approach to logging?

Firstly, Set::FA::Element used Log::Agent. I always use Log::Handler these days.

By default (i.e. without a logger object), Set::FA::Element prints messages to STDOUT, and dies upon errors.

However, by supplying a log object, you can capture these events.

Not only that, you can change the behaviour of your log object at any time, by calling "logger($logger_object)".

Specifically, $logger_object -> log(debug => 'Entered x()') is called at the start of each public method.

Setting your log level to 'debug' will cause these messages to appear.

Setting your log level to anything below 'debug', e.g. 'info, 'notice', 'warning' or 'error', will suppress them.

Also, "step_state($next)" calls:

        $self -> log(info => "Stepped from state '$current' to '$next' using rule /$rule/ to match '$match'");

just before it returns.

Setting your log level to anything below 'info', e.g. 'notice', 'warning' or 'error', will suppress this message.

Hence, by setting the log level to 'info', you can log just 1 line per state transition, and no other messages, to minimize output.

Lastly, although this logging mechanism may seem complex, it has distinct advantages:

o A design fault in Log::Agent (used by the previous author):

This method uses a global variable to control the level of logging. This means the code using Set::FA::Element can (also) use Log::Agent and call logconfig(...), which in turn affects the behaviour of the logging calls inside those other modules. I assume this design is deliberate, but I certainly don't like it, because (I suspect) it means any running Perl program which changes the configuration affects all other running programs using Log::Agent.

o Log levels

You can configure your logger object, either before calling new(), or at any later time, by changing your logger object, and then calling "logger($logger_object)".

That allows you complete control over the logging activity.

The only log levels output by this code are (from high to low): debug, info, warning and error.

Error messages are self-documenting, in that when you trigger them, you get to read them...

Machine-Readable Change Log

The file CHANGES was converted into Changelog.ini by Module::Metadata::Changes.

Version Numbers

Version numbers < 1.00 represent development versions. From 1.00 up, they are production versions.

Credit

See "Credit" in Set::FA.

See Also

See "See Also" in Set::FA.

Support

Email the author, or log a bug on RT:

https://rt.cpan.org/Public/Dist/Display.html?Name=Set::FA.

Author

Set::FA::Element was written by Mark Rogaski and Ron Savage <ron@savage.net.au> in 2011.

Home page: http://savage.net.au/index.html.

Copyright

Australian copyright (c) 2011, Ron Savage.

        All Programs of mine are 'OSI Certified Open Source Software';
        you can redistribute them and/or modify them under the terms of
        The Artistic License, a copy of which is available at:
        http://www.opensource.org/licenses/index.html