NAME

Treex::Tool::Parser::MSTperl::ModelAdditional

VERSION

version 0.11336

DESCRIPTION

A model containing edge PMI (pointwise mutual information), i.e. PMI[c,p] = log( #[c,p] / (#[c,*] * #[*,p]) ), where c = child, p = parent, and # denotes a count.
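As an illustration of the formula above, the following is a minimal sketch that computes PMI from raw edge counts. It is not the module's own code: the counts, the %edge_count layout and the pmi() helper are hypothetical, and the natural logarithm is an assumption, since the base is not stated here.

```perl
use strict;
use warnings;

# Hypothetical toy counts #[c,p] of (child, parent) edges; not from a real model.
my %edge_count = (
    dog => { barks => 2, runs => 1 },
    cat => { runs  => 1 },
);

# Marginal counts #[c,*] and #[*,p], summed over the edge counts.
my ( %child_count, %parent_count );
for my $c ( keys %edge_count ) {
    for my $p ( keys %{ $edge_count{$c} } ) {
        $child_count{$c}  += $edge_count{$c}{$p};
        $parent_count{$p} += $edge_count{$c}{$p};
    }
}

# PMI[c,p] = log( #[c,p] / (#[c,*] * #[*,p]) ); '?' for an unseen edge.
# Natural logarithm is assumed here.
sub pmi {
    my ( $c, $p ) = @_;
    my $joint = exists $edge_count{$c} ? $edge_count{$c}{$p} : undef;
    return '?' if !$joint;
    return log( $joint / ( $child_count{$c} * $parent_count{$p} ) );
}

printf "%.4f\n", pmi( 'dog', 'barks' );
```

Because #[c,p] can never exceed either marginal count, the ratio is at most 1 and the result is never positive, which matches the negative values described for the model.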

FIELDS

Public Fields

model_file

The file containing the model, i.e. a TSV file in the format child[tab]parent[tab]PMI

model_format

Currently only tsv is supported. TODO: support tsv.gz, and probably also a Data::Dumper model.

buckets

(A reference to) an array of buckets that the PMI is bucketed into (negative integers; they do not have to be sorted). The PMI is first ceiled and then falls into the nearest bucket that is lower than or equal to it (or, if there is no such bucket, into the lowest one).

Internal Fields

model

In-memory representation of the model file, in the format model->{child}->{parent} = PMI.
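A rough sketch of how a file in the child[tab]parent[tab]PMI format could be read into such a nested hash is given below. The load_tsv() helper and the sample data are hypothetical and are not the module's load method; an in-memory filehandle stands in for the real model file.

```perl
use strict;
use warnings;

# Hypothetical sample data in the child[tab]parent[tab]PMI format.
my $tsv = "dog\tbarks\t-1.10\ndog\truns\t-1.79\ncat\truns\t-0.69\n";

# Reads TSV lines from a filehandle into $model->{child}->{parent} = PMI.
sub load_tsv {
    my ($fh) = @_;
    my %model;
    while ( my $line = <$fh> ) {
        chomp $line;
        next if $line eq '';    # skip empty lines
        my ( $child, $parent, $pmi ) = split /\t/, $line;
        $model{$child}{$parent} = $pmi;
    }
    return \%model;
}

# An in-memory filehandle is used so that the sketch is self-contained.
open my $fh, '<', \$tsv or die "Cannot open in-memory file: $!";
my $model = load_tsv($fh);
close $fh;

print $model->{dog}{runs}, "\n";
```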

minBucket

The lowest bucket (a bin for all PMIs lower than that).

maxBucket

The highest bucket (a bin for all PMIs higher than that).

value2bucket

Provides fast conversion of ceiled PMIs that are between minBucket and maxBucket to buckets.
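One way such a lookup table could be precomputed is sketched below, assuming it maps every integer between the lowest and the highest bucket to the nearest bucket that is lower than or equal to it. The bucket values and variable names are hypothetical, and the module's actual construction may differ.

```perl
use strict;
use warnings;
use List::Util qw(min max);

# Hypothetical bucket list; it does not have to be sorted.
my @buckets = ( -3, -12, -1, -7 );

my $min_bucket = min @buckets;
my $max_bucket = max @buckets;
my %is_bucket  = map { $_ => 1 } @buckets;

# Walk upwards from the lowest bucket, remembering the last bucket seen,
# so each integer maps to the nearest bucket that is lower or equal.
my %value2bucket;
my $current = $min_bucket;
for my $value ( $min_bucket .. $max_bucket ) {
    $current = $value if $is_bucket{$value};
    $value2bucket{$value} = $current;
}

print "$_ => $value2bucket{$_}\n" for $min_bucket .. $max_bucket;
```

With the table in place, any ceiled PMI inside the range is converted by a single hash lookup; values outside the range only need a comparison against the lowest or the highest bucket.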

METHODS

load

get_value($child, $parent)

Returns the real PMI, i.e. a negative float (there are hundreds of thousands of possible values).

Returns '?' if PMI is unknown.

get_rounded_value($child, $parent)

Returns the ceiled PMI, i.e. the integer part of the real PMI, since PMIs are negative (there are about 30 possible values).

Returns '?' if PMI is unknown.

get_bucketed_value($child, $parent)

Returns the nearest bucket that is lower than or equal to the ceiled value of the PMI, or the lowest existing bucket if the value is even lower.

Returns '?' if PMI is unknown.
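The bucketing rule described above can be sketched as a standalone function. This is an illustration under assumptions, not the module's implementation: bucket_for() and the bucket values are hypothetical, and a simple linear scan is used instead of a precomputed table.

```perl
use strict;
use warnings;
use POSIX qw(ceil);

# Hypothetical bucket list (negative integers, unsorted).
my @buckets = ( -3, -12, -1, -7 );

# Ceils the PMI and returns the nearest bucket that is lower than or equal
# to it; falls back to the lowest bucket, and passes '?' through unchanged.
sub bucket_for {
    my ($pmi) = @_;
    return '?' if !defined $pmi || $pmi eq '?';
    my $ceiled = ceil($pmi);
    my @sorted = sort { $a <=> $b } @buckets;
    my $result = $sorted[0];    # lowest bucket is the fallback
    for my $bucket (@sorted) {
        last if $bucket > $ceiled;
        $result = $bucket;
    }
    return $result;
}

print bucket_for(-4.2), "\n";
```

For example, a PMI of -4.2 is ceiled to -4; the nearest bucket that is not higher than -4 in this hypothetical list is -7.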

AUTHORS

Rudolf Rosa <rosa@ufal.mff.cuni.cz>

COPYRIGHT AND LICENSE

Copyright © 2012 by Institute of Formal and Applied Linguistics, Charles University in Prague

This module is free software; you can redistribute it and/or modify it under the same terms as Perl itself.