DTA::CAB::Format::Raw::Waste - Datum parser: raw untokenized text (using moot/waste)
 ##========================================================================
 ## PRELIMINARIES

 use DTA::CAB::Format::Raw::Waste;

 ##========================================================================
 ## Constructors etc.

 $fmt = CLASS_OR_OBJ->new(%args);

 ##========================================================================
 ## Methods: Persistence

 @keys = $class_or_obj->noSaveKeys();

 ##========================================================================
 ## Methods: Local: model caching

 \%wmodel_or_undef = $fmt->ensureModel();
 \%config = CLASS_OR_OBJECT->loadModelConfig($wasterc);

 ##========================================================================
 ## Methods: Model I/O

 $fmt_or_undef = $fmt->ensureLoaded();
 $fmt_or_undef = $fmt->loadModel();

 ##========================================================================
 ## Methods: Input: Input selection

 $fmt = $fmt->close();
 ## + default calls fromFh()

 ##========================================================================
 ## Methods: Input: Generic API

 $doc = $fmt->parseDocument();

 ##========================================================================
 ## Methods: Output: Generic

 $type = $fmt->mimeType();
 $ext = $fmt->defaultExtension();

DTA::CAB::Format::Raw::Waste is a DTA::CAB::Format subclass for reading untokenized raw text, using moot/WASTE as the underlying tokenizer. Output methods are inherited from DTA::CAB::Format::Raw::Base.
Inherits from DTA::CAB::Format::Raw::Base.
List of default paths to search for waste.rc config files; see mootfiles(5); default value:
 ($ENV{TOKWRAP_RCDIR} ? "$ENV{TOKWRAP_RCDIR}/waste/waste.rc" : qw()),
 (defined($DTA::TokWrap::Version::VERSION) ? "$DTA::TokWrap::Version::RCDIR/waste/waste.rc" : qw()),
 "$ENV{HOME}/.wasterc",
 "/etc/wasterc",
 "/etc/default/wasterc"
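The default list above reads naturally as a "first usable candidate wins" search. The following is only a minimal sketch under that assumption; the module's actual lookup may differ. find_wasterc() is a hypothetical helper that is not part of DTA::CAB, and the DTA::TokWrap::Version entry is left out for brevity.

```perl
use strict;
use warnings;
use List::Util qw(first);

## Hypothetical helper (not part of DTA::CAB): return the first candidate
## that is a readable plain file, or undef if there is none.
sub find_wasterc {
  my @candidates = @_;
  return first { defined($_) && -f $_ && -r $_ } @candidates;
}

## Candidate list modeled on the documented defaults; the entry that
## depends on DTA::TokWrap::Version is omitted in this sketch.
my @defaults = (
  ($ENV{TOKWRAP_RCDIR} ? "$ENV{TOKWRAP_RCDIR}/waste/waste.rc" : ()),
  (defined($ENV{HOME}) ? "$ENV{HOME}/.wasterc" : ()),
  "/etc/wasterc",
  "/etc/default/wasterc",
);

my $rcfile = find_wasterc(@defaults);
print defined($rcfile) ? "using $rcfile\n" : "no waste rc file found\n";
```

Passing an explicit wasterc argument to new() avoids any dependence on this search order.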
$fmt = CLASS_OR_OBJ->new(%args);
object structure: assumed HASH
 {
  ##-- Input
  doc => $doc,                ##-- buffered input document
  wasterc => $rcFile,         ##-- waste .rc file; default: "$HOME/.wasterc" || "/etc/wasterc" || "/etc/default/waste"

  ##-- Runtime
  wmodel => \%wmodel          ##-- waste model; %wmodel=(
                              #   config => \%config,      #-- parsed rcfile (see loadModelConfig())
                              #   loaded => $time,         #-- unix timestamp of last model load
                              #   wscanner => $scanner,    #-- waste scanner
                              #   wlexer => $lexer,        #-- waste lexer
                              #   wtagger => $tagger,      #-- waste tagger
                              #   wdecoder => $decoder,    #-- waste decoder
                              #   wannotator => $wannot,   #-- waste annotator
                              #   wwriter => $wwriter,     #-- native-format writer (hack)
                              # )

  ##-- logging (in order of increasing verbosity)
  logLoad => $level,          # model loading log-level (default=$logLoad)
  logCache => $level,         # cache operation log-level (default=$logCache)
  logRun => $level,           # runtime operation log-level (default=$logRun)

  ##-- Common
  #utf8 => $bool,             ##-- utf8 mode always on
 }
@keys = $class_or_obj->noSaveKeys();
Returns list of keys not to be saved; override appends qw(doc wmodel wscanner wlexer wtagger wdecoder wannotator wwriter).
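The "override appends" pattern can be illustrated with two stand-in classes. This is a sketch only: My::Format::Base, My::Format::Waste, the parent's key list qw(fh), and the savable() helper are all hypothetical and do not reflect DTA::CAB internals; only the appended key list is taken from the text above.

```perl
use strict;
use warnings;

package My::Format::Base;   ## hypothetical stand-in for the parent class
sub new { my ($that, %args) = @_; return bless({%args}, ref($that) || $that); }
sub noSaveKeys { return qw(fh); }   ## assumed parent list, illustration only

## Hypothetical persistence helper: shallow copy without the noSaveKeys().
sub savable {
  my $obj  = shift;
  my %skip = map { ($_ => 1) } $obj->noSaveKeys;
  my %copy = map { ($_ => $obj->{$_}) } grep { !$skip{$_} } keys %$obj;
  return \%copy;
}

package My::Format::Waste;  ## hypothetical stand-in for the subclass
our @ISA = ('My::Format::Base');

## Override: the inherited keys plus the runtime-only keys listed above.
sub noSaveKeys {
  my $that = shift;
  return ($that->SUPER::noSaveKeys(),
          qw(doc wmodel wscanner wlexer wtagger wdecoder wannotator wwriter));
}

package main;
my $fmt  = My::Format::Waste->new(wasterc => '/etc/wasterc', doc => {}, wmodel => {});
my $save = $fmt->savable;
print join(',', sort keys %$save), "\n";   ## prints: wasterc
```

Runtime-only members such as the buffered document and the loaded model are dropped, while configuration such as wasterc survives.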
Cached models ("$wasterc_abspath:$PID" => \%wmodel)
 \%wmodel_or_undef = $fmt->ensureModel();
 \%wmodel_or_undef = $fmt->ensureModel($wasterc);
 \%wmodel_or_undef = CLASS->ensureModel($wasterc);
Loads cached model if available; otherwise populates cache.
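A cache keyed by "$wasterc_abspath:$PID" can be sketched as follows. This is not the module's implementation: %MODEL_CACHE, ensure_model_cached(), and the $loader code-ref are hypothetical names, and File::Spec->rel2abs() is used here as a simple way to get an absolute path (it does not resolve symlinks). Including the process ID in the key presumably keeps forked processes from sharing model handles.

```perl
use strict;
use warnings;
use File::Spec;

our %MODEL_CACHE = ();   ## hypothetical cache: "$abspath:$PID" => \%wmodel

## Hypothetical sketch of a per-process model cache; $loader is a
## caller-supplied code-ref standing in for the real model loader.
sub ensure_model_cached {
  my ($rcfile, $loader) = @_;
  my $abs = File::Spec->rel2abs($rcfile);  ## absolute path, symlinks unresolved
  my $key = "$abs:$$";                     ## $$ is the current process ID
  ## Cache hit returns the stored model; otherwise load and (if defined) store.
  return $MODEL_CACHE{$key} //= $loader->($abs);
}

my $nloads = 0;
my $loader = sub { ++$nloads; return { config => {}, loaded => time() }; };

my $m1 = ensure_model_cached('./waste.rc', $loader);
my $m2 = ensure_model_cached('waste.rc',   $loader);  ## same absolute path
print "loads=$nloads same=", ($m1 == $m2 ? 1 : 0), "\n";   ## prints: loads=1 same=1
```

A loader that returns undef leaves the cache empty, so a later call retries the load, which matches the \%wmodel_or_undef return convention.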
\%config = CLASS_OR_OBJECT->loadModelConfig($wasterc);
Loads the rc file $wasterc and returns the parsed configuration, with keys qw(abbrevs conjunctions stopwords dehyphenate hmm).
$fmt_or_undef = $fmt->ensureLoaded();
Ensures that the model is loaded.
 $fmt_or_undef = $fmt->loadModel();
 $fmt_or_undef = $fmt->loadModel($rcfile);
Backwards-compatible method; wraps ensureModel().
$fmt = $fmt->close();
(undocumented)
$fmt = $fmt->fromFh($fh)
Selects input from a filehandle.
$doc = $fmt->parseDocument();
Just returns $fmt->{doc}.
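Putting the input methods together, a typical call sequence might look like the sketch below. It uses only the methods documented here (new(), fromFh(), parseDocument()), but several points are assumptions: that a waste rc file is found via the default search, that fromFh() accepts an in-memory filehandle, and that fromFh() returns a true value on success. The whole sequence is wrapped in eval so that the script still runs where DTA::CAB or a waste model is not installed.

```perl
use strict;
use warnings;

## Raw, untokenized input text (illustrative only).
my $text = "Dies ist ein Test. Und noch ein Satz.";

my $status = eval {
  require DTA::CAB::Format::Raw::Waste;           ## dies if DTA::CAB is absent
  my $fmt = DTA::CAB::Format::Raw::Waste->new();  ## relies on default wasterc search
  open(my $fh, '<', \$text) or die "open failed: $!";
  $fmt->fromFh($fh) or die "fromFh() failed";     ## assumed to buffer the document
  my $doc = $fmt->parseDocument();                ## returns $fmt->{doc}
  defined($doc) ? 'parsed' : 'no document';
};
if (!defined($status)) {
  ## Any failure (missing module, missing model, parse error) lands here.
  ($status = "skipped: $@") =~ s/\s+\z//;
}
print "$status\n";
```

Since parseDocument() only returns the buffered document, the actual tokenization presumably happens during input selection.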
$type = $fmt->mimeType();
Default returns text/plain.
$ext = $fmt->defaultExtension();
Returns the default filename extension for this format (.raw).
Bryan Jurish <moocow@cpan.org>
Copyright (C) 2011-2019 by Bryan Jurish
This package is free software; you can redistribute it and/or modify it under the same terms as Perl itself, either Perl version 5.24.1 or, at your option, any later version of Perl 5 you may have available.
dta-cab-analyze.perl(1), dta-cab-convert.perl(1), dta-cab-http-server.perl(1), dta-cab-http-client.perl(1), dta-cab-xmlrpc-server.perl(1), dta-cab-xmlrpc-client.perl(1), DTA::CAB::Server(3pm), DTA::CAB::Client(3pm), DTA::CAB::Format(3pm), DTA::CAB(3pm), perl(1), ...