sval2plain.pl - Convert a Senseval-2 data file into plain text format
sval2plain.pl [OPTIONS] SVAL2
Note that there are 255 instances (contexts) in the Senseval-2 formatted input file.
frequency.pl begin.v-test.xml
OUTPUT =>
<sense id="begin%2:30:00::" percent="64.31"/>
<sense id="begin%2:30:01::" percent="14.51"/>
<sense id="begin%2:42:04::" percent="21.18"/>
Total Instances = 255
Total Distinct Senses=3
Distribution={64.31,21.18,14.51}
% of Majority Sense = 64.31
After converting to plain text, note that there are 255 lines in that file, one per context.
sval2plain.pl begin.v-test.xml > begin.v-test.txt
wc begin.v-test.txt
255 15049 92598 begin.v-test.txt
You can find begin.v-test.xml in samples/Data
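The one-line-per-context check done with wc above can also be scripted. The following is a minimal sketch, assuming that every instance in the input opens with an <instance ...> tag and that the plain text output ends each context with a newline. The helper names (count_instances, count_lines, same_size) and the two-instance sample strings are hypothetical and are not part of SenseClusters.

```python
import re


def count_instances(sval2_text):
    """Count opening <instance ...> tags in Senseval-2 formatted text."""
    return len(re.findall(r"<instance\b", sval2_text))


def count_lines(plain_text):
    """Count newline-terminated lines, the same quantity `wc -l` reports."""
    return plain_text.count("\n")


def same_size(sval2_text, plain_text):
    """True when the plain text has exactly one line per instance."""
    return count_instances(sval2_text) == count_lines(plain_text)


# Hypothetical in-memory stand-ins for begin.v-test.xml and begin.v-test.txt.
xml_text = (
    '<instance id="a"><context>one</context></instance>\n'
    '<instance id="b"><context>two</context></instance>\n'
)
txt_text = "one\ntwo\n"

print(same_size(xml_text, txt_text))
```

To run the same check on real data, read begin.v-test.xml and begin.v-test.txt into strings and pass them to same_size.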
You can type sval2plain.pl --help for a quick summary of options
sval2plain.pl --help
Converts a given file from Senseval-2 format into plain text format. Each line of the plain text file contains a single context. This is useful when you have Senseval-2 data that you would like to use as feature extraction (training) data, which must be in plain text format.
SVAL2: Input file in Senseval-2 format that is to be converted into plain text format.
--help: Displays the summary of command line options.
--version: Displays the version information.
sval2plain writes the given SVAL2 file to STDOUT in plain text format, with the contextual data of each instance on a separate line. Specifically, the i-th line of output contains the context of the i-th instance in the given SVAL2 file.
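The mapping from instances to output lines can be illustrated with a short Python sketch. This is not the code of sval2plain.pl. It assumes that each instance wraps its text in a <context> element, that inline markup such as <head> is dropped, and that whitespace inside a context is collapsed so the context fits on one line. The actual script may treat tags and whitespace differently. The function name sval2_to_plain, the instance ids, and the sample sentences are hypothetical.

```python
import re

# Assumed shape of a Senseval-2 instance; real files may carry extra tags.
INSTANCE_RE = re.compile(r"<instance\b[^>]*>(.*?)</instance>", re.DOTALL)
CONTEXT_RE = re.compile(r"<context\b[^>]*>(.*?)</context>", re.DOTALL)
TAG_RE = re.compile(r"<[^>]+>")


def sval2_to_plain(sval2_text):
    """Return one plain-text line per instance, in file order."""
    lines = []
    for instance in INSTANCE_RE.finditer(sval2_text):
        context = CONTEXT_RE.search(instance.group(1))
        body = context.group(1) if context else ""
        body = TAG_RE.sub(" ", body)           # drop inline markup such as <head>
        lines.append(" ".join(body.split()))   # collapse newlines and repeated spaces
    return lines


# Hypothetical two-instance input; only the sense ids come from the example above.
sample = """<corpus lang="english">
<lexelt item="begin.v">
<instance id="begin.v.001">
<answer instance="begin.v.001" senseid="begin%2:30:00::"/>
<context>
The meeting will <head>begin</head> at noon .
</context>
</instance>
<instance id="begin.v.002">
<answer instance="begin.v.002" senseid="begin%2:42:04::"/>
<context>
Rain <head>began</head>
to fall .
</context>
</instance>
</lexelt>
</corpus>"""

for line in sval2_to_plain(sample):
    print(line)
```

Regular expressions are used here instead of an XML parser so that the sketch tolerates input that is not strictly well-formed XML; that choice is an assumption of this example, not a statement about how sval2plain.pl is implemented.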
Ted Pedersen, University of Minnesota, Duluth
tpederse at d.umn.edu

Amruta Purandare, University of Pittsburgh
Copyright (c) 2002-2008, Ted Pedersen and Amruta Purandare
This program is free software; you can redistribute it and/or modify it under the terms of the GNU General Public License as published by the Free Software Foundation; either version 2 of the License, or (at your option) any later version.
This program is distributed in the hope that it will be useful, but WITHOUT ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License for more details.
You should have received a copy of the GNU General Public License along with this program; if not, write to
The Free Software Foundation, Inc., 59 Temple Place - Suite 330, Boston, MA 02111-1307, USA.