NAME

Lingua::Interset::Trie - A trie-like structure for DZ Interset features and their values.

VERSION

version 3.015

SYNOPSIS

  use Lingua::Interset::Tagset::EN::Penn;

  my $ts = Lingua::Interset::Tagset::EN::Penn->new();
  # Get a Lingua::Interset::Trie object $permitted and print all feature structures
  # that the tagset en::penn can generate.
  my $permitted = $ts->permitted_structures();
  print($permitted->as_string(), "----------\n");

DESCRIPTION

The Trie class defines a trie-like data structure for DZ Interset features and their values. It is an auxiliary data structure that an outside user should not need to use directly.

It is used to describe all feature-value combinations that are permitted under a given tagset. (Example: If the prefix already traversed in the trie indicates that we have a noun, with subtype of proper noun, what are the possible values of the next feature, say, gender?)

The trie assumes that features are ordered according to their priority. However, the priorities are defined outside the trie: by default in the FeatureStructure class, or they may be overridden in a Tagset subclass. The trie itself can store features in any order.

ATTRIBUTES

features

An array reference. Lists the features in the order in which their values appear in the trie. This may be the default order according to Lingua::Interset::FeatureStructure->priority_features(), or a custom order.

root_hash

The trie structure is implemented as a tree of hashes. The root hash corresponds to the first feature. Its keys are values of the feature and each of them leads to a second-level hash. All second-level hashes correspond to the second feature; their keys are values of that feature, and so on. The structure is interpreted as a sequence of feature queries: if feature F1 has value X, then if feature F2 has value Y, then...

We need a pointer when traversing the trie, and the pointer is always a reference to one of the hashes. The root_hash attribute is our entry pointer where we start the traversal.
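
The layout described above can be pictured with plain Perl hashes. The following is only a minimal sketch: the feature order, the feature values and the use of the empty string for an unset feature are assumptions made for illustration, and the code does not use the Trie class itself.

```perl
use strict;
use warnings;

# Hypothetical feature order and values, chosen only for illustration.
my @features = ('pos', 'nountype', 'gender');

# Tree of hashes: keys at depth N are values of $features[N].
# Assumption: the empty string stands for "feature not set".
my %root_hash =
(
    'noun' =>
    {
        'prop' => { 'masc' => {}, 'fem' => {} },
        'com'  => { 'masc' => {}, 'fem' => {}, 'neut' => {} },
    },
    'verb' =>
    {
        '' => { '' => {} },
    },
);

# Start at the root and follow the prefix pos=noun, nountype=prop.
my $pointer = \%root_hash;
foreach my $value ('noun', 'prop')
{
    $pointer = $pointer->{$value};
}

# The keys of the hash we reached are the permitted values of the next feature.
my @permitted_genders = sort keys %{$pointer};
print("Permitted gender after noun/prop: @permitted_genders\n");
```

The pointer is an ordinary hash reference, so "advancing" it means nothing more than looking up one key and keeping the reference found there.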

METHODS

add_value()

  $trie->add_value ($pointer, $value[, $tag_example]);

Adds a feature value to the trie. It does not need to know the feature name. It takes the feature value and the pointer to the trie level corresponding to the feature (reference to an existing hash). If the hash already has a key corresponding to the value, the method only advances to the sub-hash referenced by the value, and returns the new pointer. If there is no such key, the method first creates the new sub-hash and then advances the pointer.

For the last feature we can optionally provide an example of a tag where this combination of feature values occurred. It may be useful for debugging, when we see a permitted feature structure but do not understand why it is permitted.
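
The behaviour described above could look roughly like the following sketch. It is not the module's code: add_value_sketch is a hypothetical stand-in written as a plain function, the key name "tag" used for the example is an assumption, and the tags TAG1 to TAG3 are made up.

```perl
use strict;
use warnings;

# Hypothetical stand-in for the behaviour described above; not the module's code.
sub add_value_sketch
{
    my ($pointer, $value, $tag_example) = @_;
    # Create the sub-hash only if this value has not been seen at this level.
    if(!exists($pointer->{$value}))
    {
        $pointer->{$value} = {};
    }
    # Assumption: the example tag is kept in the last-level hash under the key 'tag'.
    # The first example seen for a combination is kept.
    if(defined($tag_example) && !exists($pointer->{$value}{tag}))
    {
        $pointer->{$value}{tag} = $tag_example;
    }
    # Return the advanced pointer.
    return $pointer->{$value};
}

my %root_hash;
# Each row: values for pos, nountype and gender, followed by a made-up example tag.
my @rows =
(
    ['noun', 'prop', 'masc', 'TAG1'],
    ['noun', 'prop', 'fem',  'TAG2'],
    ['noun', 'com',  'masc', 'TAG3'],
);
foreach my $row (@rows)
{
    my ($pos, $nountype, $gender, $tag) = @{$row};
    my $pointer = \%root_hash;
    $pointer = add_value_sketch($pointer, $pos);
    $pointer = add_value_sketch($pointer, $nountype);
    $pointer = add_value_sketch($pointer, $gender, $tag);
}

my @nountypes = sort keys %{$root_hash{noun}};
print("Noun subtypes: @nountypes\n");
print("Example for noun/prop/fem: $root_hash{noun}{prop}{fem}{tag}\n");
```

Note that the function never sees a feature name. The caller decides which level a value belongs to simply by passing the pointer it got from the previous call.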

advance_pointer()

  $trie->advance_pointer ($pointer, $feature, $value);

Advances a trie pointer. Normally it observes the value of the current feature; however, the features "tagset" and "other" get special treatment (any value is permitted).
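
One way to picture this special treatment is the sketch below. How the module really handles the two wildcard features is not documented here, so the approach shown (ignore the queried value and follow whichever key is stored at that level) is an assumption, and both function names are hypothetical.

```perl
use strict;
use warnings;

# Hypothetical sketch; returns the next-level hash reference or undef.
sub advance_pointer_sketch
{
    my ($pointer, $feature, $value) = @_;
    if($feature eq 'tagset' || $feature eq 'other')
    {
        # Any value is permitted here. Assumption: follow the first stored key,
        # whatever the queried value was.
        my @keys = sort keys %{$pointer};
        return @keys ? $pointer->{$keys[0]} : undef;
    }
    # Ordinary feature: the value must be a key of the current hash.
    return exists($pointer->{$value}) ? $pointer->{$value} : undef;
}

# Walks all features in order and reports whether the combination is in the trie.
sub is_permitted_sketch
{
    my ($root, $features, $fs) = @_;
    my $pointer = $root;
    foreach my $feature (@{$features})
    {
        $pointer = advance_pointer_sketch($pointer, $feature, $fs->{$feature});
        return 0 if(!defined($pointer));
    }
    return 1;
}

# Made-up trie over three features; the 'other' level stores a single empty key.
my @features = ('pos', 'other', 'gender');
my %root_hash = ('noun' => { '' => { 'masc' => {}, 'fem' => {} } });

my $ok  = is_permitted_sketch(\%root_hash, \@features, {'pos' => 'noun', 'other' => 'anything', 'gender' => 'fem'});
my $bad = is_permitted_sketch(\%root_hash, \@features, {'pos' => 'noun', 'other' => '', 'gender' => 'neut'});
print("noun/anything/fem permitted: $ok\n");
print("noun//neut permitted: $bad\n");
```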

as_string()

Returns permitted feature values in a form suitable for printing. This may be useful for debugging.
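
To give an idea of what listing a trie involves, the sketch below walks a small tree of hashes and collects one line per permitted combination. The output format is invented for this illustration and is not claimed to match what as_string() prints; paths_sketch and the data are hypothetical.

```perl
use strict;
use warnings;

# Made-up feature order and trie contents.
my @features = ('pos', 'gender');
my %root_hash = ('noun' => { 'masc' => {}, 'fem' => {} }, 'verb' => { '' => {} });

# Recursively collects every root-to-leaf path as a "feature=value|..." string.
sub paths_sketch
{
    my ($pointer, $features, $depth, $prefix) = @_;
    # All features consumed: the prefix is one complete permitted combination.
    return ($prefix) if($depth >= scalar(@{$features}));
    my @lines;
    foreach my $value (sort keys %{$pointer})
    {
        my $pair = $features->[$depth].'='.$value;
        my $extended = $prefix eq '' ? $pair : "$prefix|$pair";
        push(@lines, paths_sketch($pointer->{$value}, $features, $depth+1, $extended));
    }
    return @lines;
}

my @lines = paths_sketch(\%root_hash, \@features, 0, '');
print(join("\n", @lines), "\n");
```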

get_tag_example()

  $trie->get_tag_example ($feature_structure);

Takes a Lingua::Interset::FeatureStructure object. If this is a permitted structure according to the trie, the method returns the tag example that has been stored in the last-level hash of the trie.

Otherwise it returns an error message. This is a debugging method and it will not throw exceptions on forbidden values.
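
The lookup can be sketched as follows. This is an illustration under assumptions, not the module's implementation: the feature structure is replaced by a plain hash, the example tag is assumed to sit under the key "tag" in the last-level hash, and the wording of the error message is invented.

```perl
use strict;
use warnings;

# Made-up feature order, trie contents and example tags.
my @features = ('pos', 'gender');
my %root_hash =
(
    'noun' => { 'masc' => { 'tag' => 'TAG1' }, 'fem' => { 'tag' => 'TAG2' } },
);

# Hypothetical sketch: returns the stored example tag or an error string.
sub get_tag_example_sketch
{
    my ($root, $features, $fs) = @_;
    my $pointer = $root;
    foreach my $feature (@{$features})
    {
        # Assumption: an unset feature is looked up as the empty string.
        my $value = defined($fs->{$feature}) ? $fs->{$feature} : '';
        if(!exists($pointer->{$value}))
        {
            # Report the problem as a string instead of dying.
            return "Forbidden value '$value' of feature '$feature'";
        }
        $pointer = $pointer->{$value};
    }
    return $pointer->{tag};
}

my $found   = get_tag_example_sketch(\%root_hash, \@features, {'pos' => 'noun', 'gender' => 'fem'});
my $missing = get_tag_example_sketch(\%root_hash, \@features, {'pos' => 'noun', 'gender' => 'neut'});
print("$found\n$missing\n");
```

Returning a message rather than throwing keeps the caller's debugging loop simple: it can print the result for every structure it inspects without wrapping each call in eval.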

SEE ALSO

Lingua::Interset::Tagset, Lingua::Interset::FeatureStructure

AUTHOR

Dan Zeman <zeman@ufal.mff.cuni.cz>

COPYRIGHT AND LICENSE

This software is copyright (c) 2019 by Univerzita Karlova (Charles University).

This is free software; you can redistribute it and/or modify it under the same terms as the Perl 5 programming language system itself.