
NAME

Lingua::Awkwords - randomly generates outputs from a given pattern

SYNOPSIS

  use feature qw(say);
  use Lingua::Awkwords;
  use Lingua::Awkwords::Subpattern;

  # V is a predefined subpattern; ^ filters aa out of the
  # two-vowel strings that [VV] generates
  my $la = Lingua::Awkwords->new( pattern => q{ [VV]^aa } );

  say $la->render for 1..10;

  # define our own C, V
  Lingua::Awkwords::Subpattern->set_patterns(
      C => [qw/j k l m n p s t w/],
      V => [qw/a e i o u/],
  );
  # and a pattern somewhat suitable for Toki Pona...
  $la->pattern(q{
      [a/*2]
      (CV*5)^ji^ti^wo^wu
      (CV*2)^ji^ti^wo^wu
      [CV/*2]^ji^ti^wo^wu
      [n/*5]
  });

  say $la->render for 1..10;

DESCRIPTION

This is a Perl implementation of

http://akana.conlang.org/tools/awkwords/

though it is not an exact replica of that parser;

http://akana.conlang.org/tools/awkwords/help.html

details the format that this code is based on. Briefly,

SYNTAX

[] or ()

Denote a unit or group; they are identical except that (a) is equivalent to [a/]--that is, it represents the possibility of generating the empty string in addition to any other terms supplied.

Units can be nested recursively. There is an implicit unit at the top level of the pattern.
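As a rough illustration of the difference between the two unit forms, the following sketch renders [a] and (a) many times and tallies what comes back. The variable names are illustrative only, and the counts will differ from run to run.

```perl
use strict;
use warnings;
use Lingua::Awkwords;

# [a] has a single term, while (a) is shorthand for [a/]
my $required = Lingua::Awkwords->new( pattern => q{ [a] } );
my $optional = Lingua::Awkwords->new( pattern => q{ (a) } );

# tally what each pattern produces over a number of renders
my ( %required_seen, %optional_seen );
for ( 1 .. 200 ) {
    $required_seen{ $required->render }++;
    $optional_seen{ $optional->render }++;
}

for my $word ( sort keys %optional_seen ) {
    printf "(a) gave '%s' %d times\n", $word, $optional_seen{$word};
}
```

Only (a) should ever report the empty string.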

/

Introduces a choice within a unit. Without it, [Vx] generates whatever V represents (a list of vowels by default) followed by the letter x; [V/x], by contrast, generates either a vowel or the letter x.
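A short sketch of the two forms side by side, assuming the default V has not been redefined with set_patterns:

```perl
use strict;
use warnings;
use feature qw(say);
use Lingua::Awkwords;

# without / the terms are joined: a vowel followed by x
my $sequence = Lingua::Awkwords->new( pattern => q{ [Vx] } );

# with / one of the terms is picked: a vowel, or x
my $choice = Lingua::Awkwords->new( pattern => q{ [V/x] } );

say join ' ', map { $sequence->render } 1 .. 5;
say join ' ', map { $choice->render } 1 .. 5;
```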

*

An asterisk followed by an integer in the range 1..128 inclusive weights the current term of the alternation, if any. That is, while [a/] generates each term with equal probability, [a/*2] generates the empty string with twice the probability of the letter a.
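One way to see the effect of a weight is to count the results of many renders. This is only a sketch; the tallies are random, though the empty string should turn up roughly twice as often as a.

```perl
use strict;
use warnings;
use Lingua::Awkwords;

# the empty term carries weight 2, the letter a the default weight
my $la = Lingua::Awkwords->new( pattern => q{ [a/*2] } );

my %count = ( 'a' => 0, '' => 0 );
$count{ $la->render }++ for 1 .. 3000;

printf "a: %d, empty: %d\n", $count{a}, $count{''};
```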

^

The caret introduces a filter that must follow a unit (there is an implicit unit at the top level of a pattern). An example would be [VV]^aa or the equivalent VV^aa, which (by default) generates two vowels but replaces aa with the empty string. More than one filter may be specified.
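A minimal sketch with two filters on one unit, again assuming the default V. Words that are filtered away entirely come back as the empty string, which is printed here as (empty) so that it stays visible.

```perl
use strict;
use warnings;
use feature qw(say);
use Lingua::Awkwords;

# two filters: both aa and ii are replaced with the empty string
my $la = Lingua::Awkwords->new( pattern => q{ [VV]^aa^ii } );

my @words = map { $la->render } 1 .. 20;

say join ' ', map { length $_ ? $_ : '(empty)' } @words;
```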

A-Z

Capital ASCII letters denote subpatterns; several of these are set by default. See Lingua::Awkwords::Subpattern for how to customize them. V for example is by default equivalent to the more verbose [a/i/u].

"

Use double quotes to denote a quoted string; this prevents other characters (besides " itself) from being interpreted as some non-string value.

anything-else

Anything else not otherwise accounted for above is treated as part of a string, so ["abc"/abc] generates either the string abc or the string abc, as these are two ways of writing the same thing.
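Quoting matters once a string contains a character that the syntax would otherwise act on. In the sketch below, which relies only on the quoting rule described above, the slash inside the quotes is expected to be kept as literal text, leaving a two-way choice between a/b and c.

```perl
use strict;
use warnings;
use feature qw(say);
use Lingua::Awkwords;

# the quoted / is part of the string; the unquoted / separates terms
my $la = Lingua::Awkwords->new( pattern => q{ ["a/b"/c] } );

my %seen;
$seen{ $la->render }++ for 1 .. 100;

say for sort keys %seen;
```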

ATTRIBUTES

pattern

The awkwords pattern. If no pattern has been supplied, any call to render throws an exception.
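A sketch of guarding against the missing-pattern case with eval; the wording of the exception is not assumed here, only that one is thrown.

```perl
use strict;
use warnings;
use Lingua::Awkwords;

my $la = Lingua::Awkwords->new;    # no pattern supplied yet

# render is expected to die, so trap the exception
my $word  = eval { $la->render };
my $error = $@;
print "render failed: $error\n" if !defined $word;

# once a pattern is set, render works as usual
$la->pattern(q{ [a/b] });
print $la->render, "\n";
```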

tree

Where the parse tree is stored.

FUNCTIONS

set_filter

Utility routine for use with walk. Returns a subroutine that sets the filter_with attribute to the given value.

  $la->walk( Lingua::Awkwords::set_filter('X') );
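A slightly fuller sketch of the same call, assuming the default V. After the walk, text that a filter matches should show up as X rather than vanish, which makes filtered words easy to spot.

```perl
use strict;
use warnings;
use feature qw(say);
use Lingua::Awkwords;

my $la = Lingua::Awkwords->new( pattern => q{ [VV]^aa } );

# replace filtered text with X instead of the default empty string
$la->walk( Lingua::Awkwords::set_filter('X') );

my @words = map { $la->render } 1 .. 200;

# print a sample of the results
say join ' ', @words[ 0 .. 9 ];
```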

METHODS

new

Constructor. Typically this should be passed a pattern argument.

parse_string pattern

Returns the parse tree of the given pattern without setting the tree attribute. "COMPLICATIONS" shows one use for this.

render

Returns a string rendered from the awkwords pattern. This may be the empty string if filters have removed all the text.

walk callback

Provides a means to recurse through the parse tree. Every object in the tree calls the callback with itself ($self) as the sole argument and then, if necessary, calls walk on each of the possibilities it contains.
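As an illustration, a callback can simply record the class of each object it is handed, which gives a rough picture of how a pattern was parsed. Which classes appear, and how often, depends on the parser, so no particular output is assumed.

```perl
use strict;
use warnings;
use Lingua::Awkwords;

my $la = Lingua::Awkwords->new( pattern => q{ [VV]^aa } );

# count the objects in the parse tree by class
my %nodes;
$la->walk( sub { my ($node) = @_; $nodes{ ref $node }++ } );

printf "%-35s %d\n", $_, $nodes{$_} for sort keys %nodes;
```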

COMPLICATIONS

More complicated structures can be built by attaching parse trees to subpatterns. For example, Toki Pona could be extended to allow optional diphthongs (mostly in the second syllable) via

  use feature qw(say);
  use Lingua::Awkwords::Subpattern;
  use Lingua::Awkwords;
  
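  # consonant-vowel syllable with ji, ti, wo and wu filtered
  my $cv  = Lingua::Awkwords->parse_string(q{
      CV^ji^ti^wo^wu
  });
  # the same with a second vowel; doubled vowels are filtered too
  my $cvv = Lingua::Awkwords->parse_string(q{
      CVV^ji^ti^wo^wu^aa^ee^ii^oo^uu
  });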

  Lingua::Awkwords::Subpattern->set_patterns(
      A => $cv,
      B => $cvv,
      C => [qw/j k l m n p s t w/],
      V => [qw/a e i o u/],
  );

  my $tree = Lingua::Awkwords->new( pattern => q{
      [ a[B/BA/BAA/A/AA/AAA] / [AB/ABA/ABAA/A/AA/AAA] ] [n/*5]
  });

  say join ' ', map { $tree->render } 1 .. 10;

The default filter replacement, the empty string, can be problematic: one may not know whether a filter has been applied to the result, or the word may be filtered into an incorrect form. Consult the eg/ directory of this module's distribution for example code that customizes the filter value.

BUGS

Reporting Bugs

Please report any bugs or feature requests to bug-lingua-awkwords at rt.cpan.org, or through the web interface at http://rt.cpan.org/NoAuth/ReportBug.html?Queue=Lingua-Awkwords.

Patches might best be applied towards:

https://github.com/thrig/Lingua-Awkwords

Known Issues

There are various incompatibilities with the original version of the code; these are detailed in the parser module as they concern how e.g. weights are parsed.

See also the "Known Issues" section in all the other modules in this distribution.

SEE ALSO

Lingua::Awkwords::ListOf, Lingua::Awkwords::OneOf, Lingua::Awkwords::Parser, Lingua::Awkwords::String, Lingua::Awkwords::Subpattern

AUTHOR

thrig - Jeremy Mates (cpan:JMATES) <jmates at cpan.org>

COPYRIGHT AND LICENSE

Copyright (C) 2017 by Jeremy Mates

This program is distributed under the (Revised) BSD License: http://www.opensource.org/licenses/BSD-3-Clause