NAME
analyze - Analyze a corpus with TF-IDF
SYNOPSIS
analyze --dir=/some/corpus [options]
Options:
--help help message
--man full documentation
--dir corpus of text documents
--size ngram size
--top top TF-IDF ngrams
--stop use stopwords
--phrase search phrase
--type file extension
Examples:
perl analyze --dir=/Users/you/Documents/lit/inaugural --top=5
perl analyze --dir=/Users/you/Documents/lit/inaugural --phrase='public good'
perl analyze --dir=/Users/you/Documents/lit/inaugural --dir=/Users/you/Documents/lit/SOTU --top=5
perl analyze --dir=/Users/you/Documents/lit/Shakespeare --size=3 --top=5
perl analyze --dir=/Users/you/perl5/perlbrew/perls/perl-5.27.7/lib/site_perl/5.27.7/Music --size=1 --type=pm
OPTIONS
- --help
  Brief help message.
- --man
  Full manual page.
- --dir
  Required. Directory containing the corpus of text documents. May be given more than once.
- --size
  Ngram phrase size. Default = 2
- --top
  Show the top N ngrams seen. Default = 0
- --stop
  Constrain the ngrams by excluding stopwords. Default = 1
- --phrase
  Search the corpus for the phrase and show its TF-IDF values. Default = ''
- --type
  Read corpus files with this file extension. Default = 'txt'
DESCRIPTION
This program computes TF-IDF (term frequency-inverse document frequency) scores for the ngrams in the given corpus.
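
To make the measure concrete, here is a minimal sketch of TF-IDF over word ngrams. It is not the program's actual code: the in-memory corpus, the sub names ngrams and tfidf, and the weighting (ngram count divided by the document's ngram total, times log(N / documents containing the ngram)) are assumptions for illustration. The program itself may use a different TF-IDF variant, and stopword filtering (--stop) is left out here.

```perl
use strict;
use warnings;

# Hypothetical in-memory corpus; the real program reads files from --dir.
my %corpus = (
    'a.txt' => 'the public good is the common good',
    'b.txt' => 'the public interest is not the public good',
    'c.txt' => 'the common law is old',
);

# Split text into lowercase words and return the overlapping ngrams
# of $size words each (an empty list if the text is too short).
sub ngrams {
    my ($text, $size) = @_;
    my @words = lc($text) =~ /(\w+)/g;
    return map { join ' ', @words[$_ .. $_ + $size - 1] } 0 .. @words - $size;
}

# Return a hash reference: document name => { ngram => TF-IDF score }.
sub tfidf {
    my ($docs, $size) = @_;
    my (%count, %total, %df);
    for my $doc (keys %$docs) {
        my @grams = ngrams($docs->{$doc}, $size);
        $total{$doc} = @grams;               # number of ngrams in this document
        $count{$doc} = {};
        $count{$doc}{$_}++ for @grams;       # occurrences of each ngram
        $df{$_}++ for keys %{ $count{$doc} };    # documents containing each ngram
    }
    my $n = keys %$docs;                     # number of documents
    my %score;
    for my $doc (keys %count) {
        for my $gram (keys %{ $count{$doc} }) {
            my $tf  = $count{$doc}{$gram} / $total{$doc};
            my $idf = log($n / $df{$gram});
            $score{$doc}{$gram} = $tf * $idf;
        }
    }
    return \%score;
}

# Print the two highest-scoring bigrams per document, similar in spirit
# to --size=2 --top=2.
my $score = tfidf(\%corpus, 2);
for my $doc (sort keys %$score) {
    my $s = $score->{$doc};
    my @ranked = sort { $s->{$b} <=> $s->{$a} || $a cmp $b } keys %$s;
    splice @ranked, 2 if @ranked > 2;
    printf "%s  %-16s %.4f\n", $doc, $_, $s->{$_} for @ranked;
}
```

Under this weighting an ngram that occurs in every document scores 0, because log(N / N) = 0, while an ngram confined to one document scores highest for its frequency. Looking up a single key such as $score->{'a.txt'}{'public good'} corresponds roughly to what --phrase reports.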