Ted Pedersen
and 1 contributor

Documentation

  • CHANGES - Revision history of SenseClusters
  • FAQ - Frequently Asked Questions about SenseClusters
  • INSTALL
  • README.Flowcharts - Word and Context Clustering Flowcharts
  • README.Testing - Explains structure and use of /Testing
  • README.Toolkit - SenseClusters Toolkit directory structure with links to all program documentation
  • README.Web - [Web Interface] How to install SenseClusters Web interface
  • README.Web.SC-cgi - [Web Interface] Description of cgi files used in SenseClusters web interface
  • README.Web.SC-htdocs - [Web Interface] Description of the htdocs directory in the Web interface
  • README.Web.user_data - [Web Interface] Description of user_data directory in Web interface
  • README.samples - How to run SenseClusters sample scripts
  • TODO - List of things TODO for SenseClusters
  • balance.pl - Create a balanced Senseval-2 data file that has the same number of instances for each possible sense
  • bitsimat.pl - Build a similarity matrix from binary context vectors
  • callwrap.pl - [Web Interface] Check user input to Web interface and create discriminate.pl command to run on Web server
  • clusterlabeling.pl - Label discovered clusters based on their content
  • clusterstopping.pl - Predict the optimal number of clusters in a data set
  • cluto2label.pl - Convert Cluto output to a confusion matrix
  • create_gp.pl - [Web Interface] Creates gnuplot file (*.gp file) for Web user
  • create_plots.pl - [Web Interface] Create gnuplot output for Web interface user
  • create_tex_file.pl - [Web Interface] Create .tex file output for Web interface user
  • discriminate.pl
  • filter.pl - Remove the instances of low frequency sense tags from a Senseval-2 data file
  • format_clusters.pl - Map Cluto output to Senseval-2 format input file
  • frequency.pl - Compute the distribution of senses in a Senseval-2 data file
  • keyconvert.pl - Convert Senseval-2 answer key to SenseClusters format
  • label.pl - Assign labels to clusters in a confusion matrix to maximize agreement
  • maketarget.pl - Create target.regex file for a given Senseval-2 data file that shows all the forms of the target word
  • mat2harbo.pl - Convert matrix in SenseClusters sparse format to Harwell-Boeing (HB) format and set input parameters (lap2) for input to SVDPACKC
  • nsp2regex.pl - Convert Text-NSP output into regular expressions to be used for feature matching
  • order1vec.pl - Convert Senseval-2 format contexts into first order feature vectors in Cluto format
  • order2vec.pl - Convert Senseval-2 contexts into second order context vectors in Cluto format
  • pod-template.pl - Skeleton for creating new SenseClusters programs
  • prepare_sval2.pl - Makes sure Senseval-2 data is cleaned and has sense tags prior to invocation of SenseClusters
  • preprocess.pl - Split Senseval-2 data file into one file per lexical item (lexelt), and carry out various tokenization and formatting tasks
  • reduce-count.pl - Reduce size of feature space by removing words not in evaluation data
  • report.pl - Summarize SenseClusters results with precision, recall, and confusion matrix
  • setup.pl - Preprocess Senseval-2 data for sample experiments
  • simat.pl - Build a similarity matrix from real-valued context vectors
  • sval2plain.pl - Convert a Senseval-2 data file into plain text format
  • svdcompare.pl - Provide a "fuzzy" diff command for comparing SVD output to our key
  • svdpackout.pl - Reconstruct post-SVD form of matrix from singular values output by SVDPACKC
  • testXML.pl - [Web Interface] Check XML data to see if well-formed
  • text2sval.pl - Convert a plain text file with one context per line into Senseval-2 format
  • windower.pl - Limit window of context around a target word specified in a Senseval-2 input file
  • wordvec.pl - Construct word vectors from bigram or co-occurrence matrices

Modules

  • Text::SenseClusters - Cluster similar contexts using co-occurrence matrices and Latent Semantic Analysis