NAME
get_words - given collapsed treebank, print words only
SYNOPSIS
get_words [options] [file[s] or STDIN]
Options:
-help brief help message
-man full documentation
-sgml put <s> and </s> tokens around words
-nosgml
-parens put ( and ) tokens around words
-noparens
OPTIONS
- -help
Prints a brief help message and exits.
- -man
Prints the manual page and exits.
- -sgml
- -nosgml
Writes <s> at the beginning of each line and </s> at the end of each line, or (in the case of -nosgml) does not. Default is -sgml.
- -parens
- -noparens
Writes ( at the beginning of each line and ) at the end of each line, or (in the case of -noparens) does not. Default is -noparens.
DESCRIPTION
Reads input files (or STDIN) for Penn-style trees, one per line, and prints out only the words, one tree per line.
Providing the -sgml option makes the output pseudo-SGML by including angle-bracketed <s> and </s> tokens at the beginning and end of each line.
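The behaviour described above can be illustrated with a minimal Python sketch. It is not the get_words implementation: the function name words_from_tree, the regular expression, and the keyword arguments sgml and parens are hypothetical and chosen only to mirror the documented options. The sketch assumes every word appears in a preterminal of the form (TAG word), keeps every leaf (including any trace or empty elements), and, when both wrappers are enabled, places the parentheses inside the <s> and </s> tokens. The real tool may differ on each of these points.

```python
import re

# A preterminal looks like "(TAG word)": a tag followed by one token,
# with no nested parentheses. Assumption: every word occurs in this form.
LEAF = re.compile(r"\(([^\s()]+)\s+([^\s()]+)\)")


def words_from_tree(line, sgml=True, parens=False):
    """Return the words of one Penn-style tree as a space-separated string.

    The defaults mirror the documented ones (-sgml on, -noparens).
    """
    words = [match.group(2) for match in LEAF.finditer(line)]
    if parens:
        # Assumed ordering: parentheses sit inside the <s> ... </s> wrapper.
        words = ["("] + words + [")"]
    if sgml:
        words = ["<s>"] + words + ["</s>"]
    return " ".join(words)


tree = "(S (NP (DT the) (NN cat)) (VP (VBD sat)))"
print(words_from_tree(tree))              # <s> the cat sat </s>
print(words_from_tree(tree, sgml=False))  # the cat sat
```

Matching only preterminals avoids building a full parse of the bracketing; a stack-based reader would be needed if the tree structure itself had to be preserved.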