Ted Pedersen


WordNet::Similarity::DepthFinder - methods to find the depth of synsets in WordNet taxonomies


 use WordNet::QueryData;
 my $wn = WordNet::QueryData->new;
 defined $wn or die "Construction of WordNet::QueryData failed";

 use WordNet::Similarity::DepthFinder;

 my $obj = WordNet::Similarity::DepthFinder->new ($wn);
 my ($err, $errString) = $obj->getError ();
 $err and die $errString;

 my $wps1 = 'car#n#4';
 my $wps2 = 'oil#n#1';

 my $offset1 = $wn->offset ($wps1);
 my $offset2 = $wn->offset ($wps2);

 my @roots = $obj->getTaxonomies ($offset1, 'n');
 my $taxonomy_depth = $obj->getTaxonomyDepth ($roots[0], 'n');
 print "The maximum depth of the taxonomy where $wps1 is found is $taxonomy_depth\n";

 my @depths = $obj->getSynsetDepth ($offset1, 'n');
 print "The depth of $offset1 is $depths[0]->[0]\n";

 my @lcsbydepth = $obj->getLCSbyDepth ($wps1, $wps2, 'n', 'wps');
 print "$wps1 and $wps2 have LCS $lcsbydepth[0]->[0] with Depth $lcsbydepth[0]->[1]\n";

 # reuse the array declared above instead of redeclaring it with 'my'
 @lcsbydepth = $obj->getLCSbyDepth ($offset1, $offset2, 'n', 'offset');
 print "$offset1 and $offset2 have LCS $lcsbydepth[0]->[0] with Depth $lcsbydepth[0]->[1]\n";


The following methods are provided by this module:

$obj->initialize ($configfile)

Overrides the initialize method in WordNet::Similarity to look for and process depths files. The initialize method of the superclass is also called.

$obj->getSynsetDepth ($offset, $pos)

Returns the depth(s) of the synset denoted by $offset and $pos. The return value is a list of references to arrays. Each array has the form (depth, root).
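
The shape of that return value can be illustrated without WordNet. The following is a minimal sketch, not the module's implementation: the hypernym table, the node names, and the helper synset_depths are all hypothetical, and the sketch assumes that a root has depth 1 and that depth means the shortest upward path to a root.

```perl
use strict;
use warnings;

# Hypothetical toy hypernym graph: child => [parents]. Not WordNet data.
my %parents = (
    root_a => [],
    root_b => [],
    mid    => ['root_a'],
    leaf   => ['mid', 'root_b'],
);

# Returns a list of [depth, root] pairs, one per root reachable from $node.
# Depth is the number of nodes on the shortest path, so a root has depth 1
# (an assumption made for this sketch).
sub synset_depths {
    my ($node) = @_;
    my %best;                      # root => smallest depth seen so far
    my @queue = ([$node, 1]);      # [current node, nodes on the path so far]
    while (my $item = shift @queue) {
        my ($cur, $len) = @$item;
        my @up = @{ $parents{$cur} };
        if (!@up) {                # no parents: $cur is a root
            $best{$cur} = $len if !exists $best{$cur} || $len < $best{$cur};
            next;
        }
        push @queue, [$_, $len + 1] for @up;
    }
    return map { [ $best{$_}, $_ ] } sort keys %best;
}

# Each element is an array reference of the form (depth, root).
for my $pair (synset_depths('leaf')) {
    my ($depth, $root) = @$pair;
    print "depth $depth under root $root\n";
}
```

Here 'leaf' belongs to two toy taxonomies, so two (depth, root) pairs come back, which is why the return value is a list rather than a single number.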

$obj->getTaxonomyDepth ($offset, $pos)

Returns the maximum depth of the taxonomy rooted at the synset identified by $offset and $pos. If $offset and $pos do not identify the root of a taxonomy, then undef is returned and an error is raised.
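
As an illustration of that behavior, here is a small self-contained sketch over a toy tree. It is only a sketch under stated assumptions: the %children and %is_root tables and the helper taxonomy_depth are hypothetical, the root is assumed to have depth 1, and the error that the real method records is reduced to returning undef.

```perl
use strict;
use warnings;

# Hypothetical toy taxonomy: parent => [children]. Not WordNet data.
my %children = (
    entity  => ['object', 'idea'],
    object  => ['vehicle'],
    vehicle => ['car'],
    idea    => [],
    car     => [],
);
my %is_root = (entity => 1);

# Returns the depth of the deepest node below $root (the root itself counts
# as depth 1 in this sketch), or undef when $root is not a taxonomy root.
sub taxonomy_depth {
    my ($root) = @_;
    return undef unless $is_root{$root};
    my $max   = 0;
    my @stack = ([$root, 1]);      # [node, depth of that node]
    while (my $item = pop @stack) {
        my ($node, $depth) = @$item;
        $max = $depth if $depth > $max;
        push @stack, [$_, $depth + 1] for @{ $children{$node} };
    }
    return $max;
}

my $depth = taxonomy_depth('entity');
print "maximum depth: $depth\n";
print "car is not a root\n" unless defined taxonomy_depth('car');
```

Callers of the real method would normally follow an undef result with a call to getError, as in the synopsis.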

$obj->getTaxonomies ($offset, $pos)

Returns a list of the roots of the taxonomies to which the synset identified by $offset and $pos belongs.

$obj->getLCSbyDepth ($synset1, $synset2, $pos, $mode)

Given two input synsets, finds their least common subsumer (LCS). If there are multiple candidates for the LCS (due to multiple inheritance in WordNet), the candidate with the greatest depth is chosen (i.e., the candidate whose shortest path to the root is the longest).

Parameters: a blessed reference, two synsets, a part of speech, and a mode. The mode must be either the string 'wps' or the string 'offset'. If the mode is 'wps', then the two input synsets must be in word#pos#sense format. If the mode is 'offset', then the input synsets must be WordNet offsets.
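
The two input formats are easy to tell apart by shape. The helper below is a hypothetical illustration and is not part of the module: guess_mode, its regular expressions, and the sample identifiers are assumptions (in particular, the part-of-speech letter is assumed to be one of n, v, a, r, and '12345678' is an arbitrary number, not a real offset).

```perl
use strict;
use warnings;

# Hypothetical helper: returns 'wps', 'offset', or undef for an identifier.
sub guess_mode {
    my ($id) = @_;
    return 'wps'    if $id =~ /\A[^#\s]+#[nvar]#\d+\z/;   # word#pos#sense
    return 'offset' if $id =~ /\A\d+\z/;                  # digits only
    return undef;
}

for my $id ('car#n#4', '12345678', 'car') {
    my $mode = guess_mode($id);
    printf "%-10s %s\n", $id, defined $mode ? $mode : 'unrecognized';
}
```

A check like this only validates the shape of an identifier; it does not confirm that the synset exists in WordNet.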

Returns: a list of the form ($lcs, $depth), where $lcs is the LCS (in wps format if the mode is 'wps', or an offset if the mode is 'offset') and $depth is the depth of the LCS in its taxonomy. Returns undef on error.
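
The selection rule can be sketched on a toy graph with multiple inheritance. This is a simplified illustration rather than the module's algorithm: it ranks every common subsumer instead of deriving candidates from paths, the graph and the helpers subsumers, depth, and lcs_by_depth are hypothetical, the root is assumed to have depth 1, and results are returned as [lcs, depth] array references in the way the synopsis indexes them.

```perl
use strict;
use warnings;

# Hypothetical toy hypernym graph with a single root: child => [parents].
my %parents = (
    root    => [],
    thing   => ['root'],
    machine => ['thing'],
    vehicle => ['machine'],
    product => ['root'],
    car     => ['vehicle', 'product'],
    truck   => ['vehicle', 'product'],
);

# Maps every subsumer of $node (including $node itself) to the number of
# nodes on the shortest upward path from $node to it.
sub subsumers {
    my ($node) = @_;
    my %dist  = ($node => 1);
    my @queue = ($node);
    while (defined(my $cur = shift @queue)) {
        for my $up (@{ $parents{$cur} }) {
            next if exists $dist{$up};   # breadth-first: first visit is shortest
            $dist{$up} = $dist{$cur} + 1;
            push @queue, $up;
        }
    }
    return \%dist;
}

# Depth of a node: shortest path to the root, with the root at depth 1.
sub depth { return subsumers($_[0])->{root} }

# Returns [lcs, depth] pairs for the deepest common subsumers (ties are kept).
sub lcs_by_depth {
    my ($s1, $s2) = @_;
    my ($up1, $up2) = (subsumers($s1), subsumers($s2));
    my @common = grep { exists $up2->{$_} } keys %$up1;
    my @ranked = sort { $b->[1] <=> $a->[1] or $a->[0] cmp $b->[0] }
                 map  { [ $_, depth($_) ] } @common;
    return grep { $_->[1] == $ranked[0][1] } @ranked;
}

my @lcs = lcs_by_depth('car', 'truck');
print "LCS $_->[0] at depth $_->[1]\n" for @lcs;
```

In this toy graph 'car' and 'truck' share both 'vehicle' (depth 4) and 'product' (depth 2) as direct parents, and the deeper one, 'vehicle', is selected.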

$obj->_processSynsetsFile ($filename)

Reads and processes a synsets file as produced by wnDepths.pl.

$obj->_processTaxonomyFile ($filename)

Reads and processes a taxonomies file as produced by wnDepths.pl.


 Ted Pedersen, University of Minnesota Duluth
 tpederse at d.umn.edu

 Jason Michelizzi, University of Minnesota Duluth
 mich0212 at d.umn.edu



To report bugs, e-mail tpederse at d.umn.edu or go to http://groups.yahoo.com/group/wn-similarity/.


Copyright (c) 2005, Ted Pedersen and Jason Michelizzi

This program is free software; you can redistribute it and/or modify it under the terms of the GNU General Public License as published by the Free Software Foundation; either version 2 of the License, or (at your option) any later version.

This program is distributed in the hope that it will be useful, but WITHOUT ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License for more details.

You should have received a copy of the GNU General Public License along with this program; if not, write to

 The Free Software Foundation, Inc.,
 59 Temple Place - Suite 330,
 Boston, MA  02111-1307, USA.

Note: a copy of the GNU General Public License is available on the web at http://www.gnu.org/licenses/gpl.txt and is included in this distribution as GPL.txt.

