CHANGES - Revision history for WordNet::SenseRelate::AllWords

        Date : May 27, 2009

        1)  Added --backoff option to This option backs off to
            WordNet sense1 if the measure can't assign any sense.

        2) used querySense and reformatted the original
            text that contained the first word in the synset instead of the
            original word. It no longer uses querySense, and the formatted
            text contains the original semcor lemmas.

        3)  Added scripts utils/,
            utils/ and

            utils/ extracts plain text from a
            semcor formatted file. The text contains function words, content
            words as well as punctuation marks. This text is used for
            part-of-speech tagging.

            utils/ extracts content words
            given an answer file (typically a plain text file extracted
            using which has been tagged using a
            part of speech tagger) and a key file extracted using
            --key option.

            utils/ takes Penn Treebank tagged text
            (format word PENNPOS per line) and converts it to WordNet tagged

        4)  Added default config files
            web/cgi-bin/allwords/user_data/lesk-stoplist.conf and
            web/cgi-bin/allwords/user_data/vector-stoplist.conf to the web
            folder. This is important because if the default config
            stoplists are not used for lesk and vector, they end up giving
            unexpected results.

        5)  Changed to format the string only if it is in
            word#pos#sense format.

        Date : April 30, 2009

        1)  Added following options to

            --usemono enable assigning the only available sense to monosemy

            This is a change in, where we assign the available
            sense to monosemous words. This increases precision and overall
            F-measure because many measures cannot get these words right,
            as they can't find relatedness with the surrounding words.

        2) Added following options to

            1. --word score only this specific instance
            2. --exceptword score all instances except for these instances
            3. --score poly score only polysemous instances
            4. --score s1nc score only the instances where the most frequent
            sense is not correct
            5. --score n score only the instances having n number of senses

            These options were added in order to carry out different
            experiments.

        3)  Fixed the script This is a perl
            script that gives corpus statistics given a semcor-reformatted
            corpus file.

        Date : April 2, 2009

        1)  Fixed for window=2. If it is the first word in a
            sentence and the window size is 2, we now consider a word at the
            right of the target word; otherwise it would not be assigned any
            sense. In the previous version we simply assigned sense1 in
            such a scenario.

        2)  Added a script to utils/ which gives
            statistics of the reformatted corpus.

        Date : March 16, 2009

        1)  Fixed and the web interface output for multiword
            compounds (for example, by_and_large).

        2)  Punctuation fix for the raw format. If the format is raw and
            the context contains words such as i.e. or et_al., then the
            period is not removed.

        3)  If the format is raw and the context contains user-identified
            compound words (for example St._Petersburg), then the
            punctuation is not removed. Please refer to the README for
            details.

        4)  Mapped the Penn Treebank IN tag to the WordNet 'r' (adverb) part
            of speech tag because most of the words in this class have an
            adverb sense in WordNet.

        Date : February 14, 2009

        1)  Fixed the links of the default stoplist and nocompoundify option
            on the web interface.

        Date : February 13, 2009

        1)  Added script for scoring. This script is
            modeled after the scorer2 C program. The scorer2 C program
            is no longer used for scoring.

        2)  Changed and the web interface output format. The output
            now displays the original text. For example, if the input text
            is 'parking_ticket#n are#v expensive#a' then the output will be
            'parking_tickets#n#1 are#v#1 expensive#a#1'. Modified t/wsd.t
            for this change.

        3)  Removed config file option from the web interface

        4)  Added --nocompoundify option in and the web interface

        5)  Modified the default stoplist used by the web interface
            (web/cgi-bin/allwords/user_data/default-stoplist-raw.txt) and
            replicated it to samples/default-stoplist-raw.txt to be used by

        6)  Fixed the ForcePos option on the web interface.

        7)  Fixed the web interface crash when the user didn't provide any

        8)  Modified to skip the entries where no sense is

        9)  --scorer2 option is no longer provided in and
            the script calls utils/

        Date : January 22, 2009

        1)  Setup for experiments. Added 2 scripts
            and sorts the output of
            so that it can be used as input to the
            Senseval scorer2 program. runs wsd

        2)  Documentation changes in,,

        3)  Makefile changes to install newly added perl scripts.

        Date : November 03, 2008

        1)  Fixed client server model for the web interface. The allwords
            server can run on a different machine from the one where your
            web server is running.

        2)  User data is now stored on the client machine so that it is
            easily accessible to the user.

        3)  Added a mandatory --logfile parameter along with other optional
            parameters to

        4)  Removed --outfile option from and added --glosses options
            to display glosses

        5)  Moved default-stoplist-raw.txt from
            WordNet-SenseRelate-AllWords/web/cgi-bin/allwords to

        Date : June 17, 2008

        1)  Removed compounding logic from Compounding is now
            done using WordNet::Tools compoundify. The clients, and
            don't call compoundify any more. They create a
            WordNet::Tools object and pass it as a parameter while creating
            an AllWords object.

        2)  Moved sentence splitter from to a util program,
            All input formats will now assume that input
            is one sentence per line, one line per sentence.

        3)  Removed parsed format test case from t/wsd.t

        4)  Context was considered as a single line if more than one line
            was entered in the textarea on the web interface. Now the input
            is considered as one sentence per line, one line per sentence.

        Date : May 29, 2008

        1)  Added proper error messages for the words which are ignored
            during disambiguation. A suffix is attached to the word in order
            to indicate why it is ignored for disambiguation. For example,
            the word 'the' is not defined in WordNet and hence is ignored.
            We attach the#ND to indicate that this word is not defined in
            WordNet.

        2)  Added context file and config file option to the Web interface

        3)  Removed --compounds option from as we decided to use
            centralized compoundify module WordNet::Tools

        4)  Added compoundify in This is done using the centralized
            WordNet::Tools module

        5)  Removed WordNet::SenseRelate dependency of the Web interface.
            Now compoundifying is done using WordNet::Tools

        Date : April 10, 2008 (all changes by TDP)

        1)  Modified test cases further to check for WordNet version using
            WordNet::Tools. Changes introduced in 0.08 testing were based on
            WordNet::QueryData's deprecated version test. Now skip all tests
            if we don't find version 2.0, 2.1, or 3.0, and introduce version
            specific tests within those as needed (if results change from
            one version of WordNet to the next).

        2)  Added reference to Jason Michelizzi's MS thesis, which gives
            many details about the AllWords approach, plus experimental

        Date : March 20, 2008 (all changes by TDP)

        1)  Fixed bug in that caused disambiguate to fail when
            window is not specified. We now default to window size 3.

        2)  Fixed synopsis in to provide a cut and paste program
            that should run

        3)  Modified Makefile.PL to allow for use of 'make dist' and create

        4)  Changed release procedure, now using 'make dist' in order to
            avoid CVS export, and also create a directory that unpacks with
            a version number.

        Date : March 17, 2008

        1)  Added web interface support for AllWords (see /web for details)

        2)  Updated version information in to 0.07 (vk)

        3)  changed Changes file to CHANGES.pod (tdp)

        4)  added INSTALL.pod to /web (tdp)

        5)  updated version, copyright, and author info in README.pod (tdp)

        Thu May 19 14:06:10 2005 (all changes by JM)

        1)  added fixed mode

        Mon May 02 16:12:36 2005 (all changes by JM)

        1)  changed definition of context window to be total number of words

        2)  cleaned up errors in documentation

        3)  renamed as

        4)  added command-line help to and
            and expanded documentation

        5)  changed the compound-finding function so that only collocations
            of length MAX_COMPOUND_LENGTH or less are considered.
            MAX_COMPOUND_LENGTH is a constant defined in

        6)  added new wntagged format

        Tue Apr 12 10:40:48 2005 (all changes by JM)

        1)  fixed serious bug that often prevented higher numbered senses of
            target words from being considered

        2)  fixed errors in when --format is omitted

        3)  added diagnostic messages when stoplist is malformed

        4)  fixed bug in windowing that prevented window from expanding
            under certain circumstances

        5)  added new trace levels for displaying semantic relatedness
            scores and making output of zero values optional

        6)  fixed bug where sense1 and random schemes would fail when used
            with a stoplist or tagged text

        7)  clarified description of window in documentation

        8)  added sample stoplist

        9)  suppress irrelevant configuration information when is run
            under sense1 or random

        10) updated test scripts to reflect recent changes

        11) renamed as WordNet::SenseRelate::AllWords

        Fri Mar 11 15:25:18 2005 (all changes by JM)

        1)  added scripts for converting semcor files and formatting the
            output for Senseval

        2)  added another test script

        3)  changed the input format(s) to

        4)  expanded documentation

        Mon Jan 17 10:01:00 2005 (all changes by JM)

        1)  added part of speech coercion option

        2)  expanded discussions in README

        Wed Nov 3 12:52:33 2004 (all changes by JM)

        1)  original version; created by h2xs 1.23 with options -n
            WordNet::SenseRelate -X -b 5.6.0

     Varada Kolhatkar, University of Minnesota, Duluth
     kolha002 at

     Ted Pedersen, University of Minnesota, Duluth
     tpederse at

    This document last modified by : $Id: CHANGES.pod,v 1.23 2009/05/27
    21:18:12 kvarada Exp $


Copyright (c) 2008, Varada Kolhatkar, Ted Pedersen, Jason Michelizzi
    Permission is granted to copy, distribute and/or modify this document
    under the terms of the GNU Free Documentation License, Version 1.2 or
    any later version published by the Free Software Foundation; with no
    Invariant Sections, no Front-Cover Texts, and no Back-Cover Texts.

    Note: a copy of the GNU Free Documentation License is available on the
    web and is included in this distribution as FDL.txt.