
NAME

WordNet::SenseRelate::TargetWord - Perl module for performing word sense disambiguation.

SYNOPSIS

  use WordNet::SenseRelate::TargetWord;

  # Both calls return a list (see the example code below), so take the first element.
  my ($tool) = WordNet::SenseRelate::TargetWord->new();

  my ($sense) = $tool->disambiguate($instance);

DESCRIPTION

WordNet::SenseRelate::TargetWord ties together the stages of the word sense disambiguation process. It lets the user select the disambiguation algorithm, the context selection algorithm, and any preprocessing tasks. The module applies these to the context of the target word and returns the selected sense.

USING THE API (WITH EXAMPLE CODE)

The WordNet::SenseRelate::TargetWord module handles the bookkeeping: it initializes the processing modules, initializes the data, and passes the data between modules. The following code fragments can serve as a guide for using the module to disambiguate a word within its context.

Start by initializing the module:

  use WordNet::SenseRelate::TargetWord;

  # Create a hash with the config options
  my %wsd_options = (
      preprocess       => [],
      preprocessconfig => [],
      context          => 'WordNet::SenseRelate::Context::NearestWords',
      contextconfig    => { windowsize => 5, contextpos => 'n' },
      algorithm        => 'WordNet::SenseRelate::Algorithm::Local',
      algorithmconfig  => { measure => 'WordNet::Similarity::res' },
  );

  # Initialize the object
  my ($wsd, $error) = WordNet::SenseRelate::TargetWord->new(\%wsd_options, 0);

In the current implementation, an "instance" is a hash reference with these fields: "text", "words", "head", "target", "wordobjects", "lexelt", "id", "answer" and "targetpos". The values for "text", "words" and "wordobjects" are array references; the remaining values are scalars. An instance can therefore be created as follows:

  my $hashRef = {};             # Creates a reference to an empty hash.
  $hashRef->{text} = [];        # Value is an empty array ref.
  $hashRef->{words} = [];       # Value is an empty array ref.
  $hashRef->{wordobjects} = []; # Value is an empty array ref.
  $hashRef->{head} = -1;        # Index into the text array (initialized to -1)
  $hashRef->{target} = -1;      # Index into the words & wordobjects arrays (initialized to -1)
  $hashRef->{lexelt} = "";      # Lexical element (terminology from Senseval2)
  $hashRef->{id} = "";          # Some ID assigned to this instance
  $hashRef->{answer} = "";      # Answer key (only required for evaluation)
  $hashRef->{targetpos} = "";   # Part-of-speech of the target word (if known).

The fields that matter for disambiguation are wordobjects and target. The wordobjects array is an array of WordNet::SenseRelate::Word objects. Given a word (say "bank"), a WordNet::SenseRelate::Word object can be created like this:

  use WordNet::SenseRelate::Word;

  my $wordobj = WordNet::SenseRelate::Word->new("bank");

The wordobjects array represents a sentence or paragraph containing the word to be disambiguated. The target field is an index into this array, pointing to the word to be disambiguated. So, for a given example sentence, the instance would be populated as follows:

  my @sentence = ("The", "boat", "ran", "aground", "on", "the", "river", "bank");
  foreach my $theword (@sentence)
  {
    my $wordobj = WordNet::SenseRelate::Word->new($theword);
    push(@{$hashRef->{wordobjects}}, $wordobj);
    push(@{$hashRef->{words}}, $theword);
  }
  $hashRef->{target} = 7;        # Index of "bank"
  $hashRef->{id} = "Instance1";  # ID can be any string.
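
Hard-coding the index works for a single sentence, but the position can also be computed. The following is a minimal sketch in plain Perl. Note that find_target_index is a hypothetical helper, not part of this module, and it assumes that the first case-insensitive match is the word to disambiguate:

```perl
use strict;
use warnings;

# Hypothetical helper (not part of WordNet::SenseRelate::TargetWord).
# Returns the index of the first element of @$words that matches $target,
# ignoring case, or -1 if the word does not occur.
sub find_target_index {
    my ($words, $target) = @_;
    for my $i (0 .. $#{$words}) {
        return $i if lc($words->[$i]) eq lc($target);
    }
    return -1;
}

my @sentence = ("The", "boat", "ran", "aground", "on", "the", "river", "bank");
my $target_index = find_target_index(\@sentence, "bank");
print "$target_index\n";   # prints 7
```

If the target word occurs more than once in the sentence, the caller still has to decide which occurrence is meant.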

The remaining fields are not really used by the system, but they can still be initialized for later use:

  $hashRef->{lexelt} = "bank.n";
  $hashRef->{answer} = "bank#n#1";
  $hashRef->{targetpos} = "n";        # n, v, a or r
  $hashRef->{text} = ["The boat ran aground on the river", "bank"];
  $hashRef->{head} = 1;               # Index of "bank" in the text array
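
Before handing an instance to the module, it can help to check that it has the shape described above. The sketch below is one possible check in plain Perl. Here check_instance is a hypothetical helper, not part of the module, and the rule that words and wordobjects must have the same length is an assumption drawn from target indexing both arrays:

```perl
use strict;
use warnings;

# Hypothetical checker (not part of WordNet::SenseRelate::TargetWord).
# Call in list context: returns a list of problems, empty if none were found.
sub check_instance {
    my ($inst) = @_;
    return ("instance is not a hash reference") unless ref($inst) eq 'HASH';

    my @problems;
    for my $field (qw(text words wordobjects)) {
        push @problems, "$field must be an array reference"
            unless ref($inst->{$field}) eq 'ARRAY';
    }
    return @problems if @problems;   # index checks need the arrays

    # Assumption: the two arrays are parallel, since target indexes both.
    push @problems, "words and wordobjects differ in length"
        unless @{ $inst->{words} } == @{ $inst->{wordobjects} };

    my $target = $inst->{target};
    push @problems, "target is not a valid index into wordobjects"
        unless defined($target)
            && $target =~ /^\d+$/
            && $target < @{ $inst->{wordobjects} };

    return @problems;
}

# Plain strings stand in for WordNet::SenseRelate::Word objects here.
my $example = {
    text        => [],
    words       => ["river", "bank"],
    wordobjects => ["river", "bank"],
    target      => 1,
};
my @problems = check_instance($example);
if (@problems) {
    print "$_\n" for @problems;
}
else {
    print "instance looks well-formed\n";
}
```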

Finally, the disambiguation is done as follows:

  my $sense;
  ($sense, $error) = $wsd->disambiguate($hashRef);  # reuses $error declared above
  print "$sense\n";

The scalar $sense contains the selected sense of the target word, and can be processed as required.
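
As one example of such processing, the answer key used earlier ("bank#n#1") follows a word#pos#sense layout. Assuming $sense comes back in the same layout (an assumption worth verifying against the output of your installation), it can be split into its parts. Here parse_sense is a hypothetical helper, not part of the module:

```perl
use strict;
use warnings;

# Hypothetical helper (not part of WordNet::SenseRelate::TargetWord).
# Splits a "word#pos#sense" string into (word, pos, sense number).
# Returns an empty list (undef in scalar context) if the layout differs.
sub parse_sense {
    my ($sense) = @_;
    return unless defined $sense;
    if ($sense =~ /^([^#]+)#([nvar])#(\d+)$/) {
        return ($1, $2, $3);
    }
    return;
}

my ($word, $pos, $number) = parse_sense("bank#n#1");
print "$word $pos $number\n";   # prints "bank n 1"
```

The same layout makes evaluation straightforward: when the answer field is filled in, a plain string comparison between $sense and $hashRef->{answer} tells whether the selected sense matches the key.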

EXPORT

None by default.

SEE ALSO

perl(1)

WordNet::Similarity(3)

http://wordnet.princeton.edu

http://senserelate.sourceforge.net

http://groups.yahoo.com/group/senserelate

AUTHOR

Ted Pedersen, tpederse at d.umn.edu

Siddharth Patwardhan, sidd at cs.utah.edu

Satanjeev Banerjee, banerjee+ at cs.cmu.edu

COPYRIGHT AND LICENSE

Copyright (c) 2005 by Ted Pedersen, Siddharth Patwardhan and Satanjeev Banerjee

This library is free software; you can redistribute it and/or modify it under the same terms as Perl itself, either Perl version 5.8.3 or, at your option, any later version of Perl 5 you may have available.