NAME

DiaColloDB::Relation - diachronic collocation db, relation API (abstract & utilities)

SYNOPSIS

 ##========================================================================
 ## PRELIMINARIES
 
 use DiaColloDB::Relation;
 
 ##========================================================================
 ## Constructors etc.
 
 $rel = $CLASS_OR_OBJECT->new(%args);
 
 ##========================================================================
 ## Relation API: create
 
 $rel = $CLASS_OR_OBJECT->create($coldb, $tokdat_file, %opts);
 
 ##========================================================================
 ## Relation API: union
 
 $rel = $CLASS_OR_OBJECT->union($coldb, \@pairs, %opts);
 
 ##========================================================================
 ## Relation API: profile
 
 $mprf = $rel->profile($coldb, %opts);
 
 ##========================================================================
 ## Relation API: comparison (diff)
 
 $mpdiff = $rel->compare($coldb, %opts);
 $mpdiff = $rel->diff($coldb, %opts);
 
 ##========================================================================
 ## Relation API: default: subprofile()
 
 $prf = $rel->subprofile(\@xids, %opts);
 
 ##========================================================================
 ## Relation API: default: qinfo()
 
 \%qinfo = $rel->qinfo($coldb, %opts);
 (\@q1strs,\@q2strs,\@qxstrs,\@fstrs) = $rel->qinfoData($coldb,%opts);

DESCRIPTION

DiaColloDB::Relation is a base class for low-level indices capable of returning raw frequency data suitable for constructing DiaColloDB::Profile::Multi objects. In addition to the API specification, the DiaColloDB::Relation package also provides several common utility methods used by native DiaColloDB index types.

Globals & Constants

Variable: @ISA

DiaColloDB::Relation inherits from DiaColloDB::Persistent.

Constructors etc.

new
 $rel = $CLASS_OR_OBJECT->new(%args);

%args, object structure: none defined at this level; see subclass documentation for details.

Relation API: create

create
 $rel = $CLASS_OR_OBJECT->create($coldb, $tokdat_file, %opts);

Populates the relation database from $tokdat_file, a tt-style text file containing one token-id per line, with optional blank lines. %opts: clobber %$rel.
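
As a rough illustration of the input format described above, the following sketch reads one token-id per line from an in-memory string. The helper name read_tokdat() is hypothetical, and treating blank lines as boundaries between token sequences is an assumption rather than something this documentation states.

```perl
use strict;
use warnings;

## hypothetical helper (not part of DiaColloDB): read a tt-style token-id stream
##  + returns an array-ref of sequences, each an array-ref of integer ids
##  + ASSUMPTION: blank lines separate sequences
sub read_tokdat {
  my $fh = shift;
  my (@seqs, @cur);
  while (defined(my $line = <$fh>)) {
    chomp $line;
    if ($line =~ /^\s*$/) {
      ## blank line: close the current sequence, if any
      push @seqs, [@cur] if @cur;
      @cur = ();
      next;
    }
    push @cur, $line + 0;
  }
  push @seqs, [@cur] if @cur;   ## final sequence without trailing blank line
  return \@seqs;
}

## in-memory stand-in for $tokdat_file
my $data = "1\n2\n\n3\n1\n4\n";
open(my $fh, '<', \$data) or die "open failed: $!";
my $seqs = read_tokdat($fh);
close($fh);

print join(' | ', map { join(' ', @$_) } @$seqs), "\n";
```

A real create() implementation would build its index from such a stream rather than hold all sequences in memory; the sketch only shows the file layout.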

Relation API: union

union
 $rel = $CLASS_OR_OBJECT->union($coldb, \@pairs, %opts);
  • merge multiple co-frequency indices into new object

  • @pairs : array of pairs ([$argrel,\@xi2u],...) of relation-objects $argrel and tuple-id maps \@xi2u for $rel

  • %opts: clobber %$rel

  • implicitly flushes the new index
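
The role of the tuple-id maps can be sketched with plain hashes standing in for relation objects. Everything below is illustrative: the routine union_cofreqs(), the "ID1 ID2" key layout, and the toy data are assumptions, not the storage format used by any native index.

```perl
use strict;
use warnings;

## toy stand-ins for relation objects: { "LOCAL_ID1 LOCAL_ID2" => co-frequency }
my $rel_a = { '0 1' => 3, '1 2' => 1 };
my $rel_b = { '0 1' => 2, '0 2' => 5 };

## tuple-id maps: $xi2u->[$local_id] = $union_id
my @pairs = (
  [$rel_a, [0, 1, 2]],   ## first argument: ids unchanged
  [$rel_b, [1, 3, 2]],   ## second argument: ids renumbered
);

## hypothetical merge routine: sum co-frequencies under union ids
sub union_cofreqs {
  my $pairs = shift;
  my %union;
  foreach my $pair (@$pairs) {
    my ($argrel, $xi2u) = @$pair;
    foreach my $key (keys %$argrel) {
      my ($i1, $i2) = split(' ', $key);
      $union{ $xi2u->[$i1] . ' ' . $xi2u->[$i2] } += $argrel->{$key};
    }
  }
  return \%union;
}

my $union = union_cofreqs(\@pairs);
print "$_ => $union->{$_}\n" foreach sort keys %$union;
```

Here the pair '0 2' of the second argument and the pair '1 2' of the first both map to union pair '1 2', so their frequencies are summed.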

Relation API: profile

profile
 $mprf = $rel->profile($coldb, %opts);

Get a relation-specific profile for selected items as a DiaColloDB::Profile::Multi object; called by DiaColloDB::profile().

%opts:

 ##-- selection parameters
 query => $query,           ##-- target request ATTR:REQ...
 date  => $date1,           ##-- string or array or range "MIN-MAX" (inclusive) : default=all
 ##
 ##-- aggregation parameters
 slice   => $slice,         ##-- date slice (default=1, 0 for global profile)
 groupby => $groupby,       ##-- string or array "ATTR1[:HAVING1] ...": default=$coldb->attrs; see groupby() method
 ##
 ##-- scoring and trimming parameters
 eps     => $eps,           ##-- smoothing constant (default=0)
 score   => $func,          ##-- scoring function (f|fm|lf|lfm|mi|ld) : default="f"
 kbest   => $k,             ##-- return only $k best collocates per date (slice) : default=-1:all
 cutoff  => $cutoff,        ##-- minimum score
 global  => $bool,          ##-- trim profiles globally (vs. locally for each date-slice?) (default=0)
 ##
 ##-- profiling and debugging parameters
 strings => $bool,          ##-- do/don't stringify (default=do)
 fill    => $bool,          ##-- if true, returned multi-profile will have null profiles inserted for missing slices
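
To make the interplay of the kbest and cutoff parameters concrete, here is a minimal sketch operating on a plain hash of scores. The helper trim_scores() is hypothetical; it assumes that cutoff is an inclusive lower bound and that it is applied before kbest, neither of which is specified here.

```perl
use strict;
use warnings;

## hypothetical helper (not a DiaColloDB method): trim a { collocate => score } hash
##  + ASSUMPTION: 'cutoff' is an inclusive lower bound, applied before 'kbest'
sub trim_scores {
  my ($scores, %opts) = @_;
  my $k      = defined($opts{kbest}) ? $opts{kbest} : -1;   ## -1: keep all
  my $cutoff = $opts{cutoff};

  ## best-first order; ties broken alphabetically for a stable result
  my @keys = sort { $scores->{$b} <=> $scores->{$a} || $a cmp $b } keys %$scores;
  @keys = grep { $scores->{$_} >= $cutoff } @keys if defined $cutoff;
  splice(@keys, $k) if $k >= 0 && $k < @keys;

  return { map { ($_ => $scores->{$_}) } @keys };
}

my %scores = (haus => 4.2, hund => 1.5, katze => 3.1, maus => 0.4);
my $kept   = trim_scores(\%scores, kbest => 2, cutoff => 1);
print join(' ', sort keys %$kept), "\n";
```

With these toy scores, the cutoff drops "maus" and kbest=2 then keeps only the two highest-scoring items.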

The default implementation calls $rel->subprofile() for every requested date-slice and collects the result in a DiaColloDB::Profile::Multi object.

Default values for %opts should be set by a higher-level call, e.g. DiaColloDB::profile().
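
The per-slice loop described above might look roughly like the following sketch. It assumes that slices are labelled by their lower bound and that slice=0 yields a single global label; slice_labels() and toy_subprofile() are hypothetical stand-ins, and a plain hash takes the place of the DiaColloDB::Profile::Multi object.

```perl
use strict;
use warnings;

## hypothetical helper: slice labels covering the inclusive date range [$min,$max]
##  + ASSUMPTION: each slice is labelled by its lower bound
sub slice_labels {
  my ($min, $max, $slice) = @_;
  return (0) if !$slice;                       ## slice=0: one global profile
  my @labels;
  for (my $lo = int($min / $slice) * $slice; $lo <= $max; $lo += $slice) {
    push @labels, $lo;
  }
  return @labels;
}

## stand-in for $rel->subprofile(\@xids, %opts): records what it was asked for
sub toy_subprofile {
  my ($xids, %opts) = @_;
  return { label => $opts{label}, nxids => scalar(@$xids) };
}

## collect one sub-profile per slice, keyed by slice label
my %multi;
foreach my $label (slice_labels(1985, 2012, 10)) {
  $multi{$label} = toy_subprofile([10, 11, 12], label => $label);
}
print join(' ', sort { $a <=> $b } keys %multi), "\n";
```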

Relation API: comparison (diff)

compare
 $mpdiff = $rel->compare($coldb, %opts);

Get a relation-specific comparison profile for selected items as a DiaColloDB::Profile::MultiDiff object.

%opts:

 ##-- selection parameters
 (a|b)?query => $query,       ##-- target query as for parseRequest()
 (a|b)?date  => $date1,       ##-- string or array or range "MIN-MAX" (inclusive) : default=all
 ##
 ##-- aggregation parameters
 groupby      => $groupby,    ##-- string or array "ATTR1[:HAVING1] ...": default=$coldb->attrs; see groupby() method
 (a|b)?slice  => $slice,      ##-- date slice (default=1, 0 for global profile)
 ##
 ##-- scoring and trimming parameters
 eps     => $eps,           ##-- smoothing constant (default=0)
 score   => $func,          ##-- scoring function (f|fm|lf|lfm|mi|ld) : default="f"
 kbest   => $k,             ##-- return only $k best collocates per date (slice) : default=-1:all
 cutoff  => $cutoff,        ##-- minimum score
 global  => $bool,          ##-- trim profiles globally (vs. locally for each date-slice?) (default=0)
 diff    => $diff,          ##-- low-level score-diff operation (diff|adiff|sum|min|max|avg|havg); default='adiff'
 ##
 ##-- profiling and debugging parameters
 strings => $bool,          ##-- do/don't stringify (default=do)
 ##
 ##-- subclass abstraction parameters
 _gbparse => $bool,         ##-- if true (default), 'groupby' clause will be parsed only once, using $coldb->groupby() method
 _abkeys  => \@abkeys,      ##-- additional key-suffixes KEY s.t. (KEY=>VAL) gets passed to profile() calls if e.g. (aKEY=>VAL) is in %opts

The default implementation just wraps the profile() method; default values for %opts should be set by a higher-level call, e.g. DiaColloDB::compare().
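
One way to read the (a|b)?KEY notation is that an unprefixed KEY applies to both operands, while aKEY and bKEY override it for one operand only. The sketch below derives two option hashes on that reading; split_ab_opts() is a hypothetical helper and is not claimed to match the actual compare() code.

```perl
use strict;
use warnings;

## hypothetical helper: derive per-operand option hashes for two profile() calls
##  + ASSUMPTION: (aKEY=>VAL) overrides (KEY=>VAL) for operand a, likewise for b
sub split_ab_opts {
  my ($abkeys, %opts) = @_;
  my %aopts = %opts;
  my %bopts = %opts;
  foreach my $key (@$abkeys) {
    $aopts{$key} = $opts{"a$key"} if exists $opts{"a$key"};
    $bopts{$key} = $opts{"b$key"} if exists $opts{"b$key"};
    ## prefixed keys are consumed here and not passed on
    delete @aopts{"a$key", "b$key"};
    delete @bopts{"a$key", "b$key"};
  }
  return (\%aopts, \%bopts);
}

my ($aopts, $bopts) = split_ab_opts(
  [qw(query date slice)],
  query => 'Haus',
  adate => '1900-1949',
  bdate => '1950-1999',
  slice => 0,
  score => 'ld',
);
print "a: $aopts->{query} $aopts->{date}\n";
print "b: $bopts->{query} $bopts->{date}\n";
```

Each resulting hash could then be passed to a separate profile() call, with the two multi-profiles combined into the comparison result.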

diff
 $mpdiff = $rel->diff($coldb, %opts);

Alias for compare().

Relation API: default: subprofile()

subprofile
 $prf = $rel->subprofile(\@xids, %opts);

Native index API low-level profiling function; default implementation just throws an error.
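
The abstract-method pattern described above can be sketched with two toy packages. My::ToyBase and My::ToyRelation are hypothetical names, the {cof} table is invented for illustration, and the returned hash merely stands in for a profile object.

```perl
use strict;
use warnings;

package My::ToyBase;       ## hypothetical stand-in for the abstract base class
sub new {
  my ($that, %args) = @_;
  return bless({ %args }, ref($that) || $that);
}
sub subprofile {
  my $rel = shift;
  die(ref($rel) || $rel, "::subprofile(): abstract method called\n");
}

package My::ToyRelation;   ## hypothetical subclass providing a real subprofile()
our @ISA = ('My::ToyBase');
sub subprofile {
  my ($rel, $xids, %opts) = @_;
  my %f12;
  foreach my $xid (@$xids) {
    my $cof = $rel->{cof}{$xid} or next;   ## toy table: {$xid}{$collocate_id} => freq
    $f12{$_} += $cof->{$_} foreach keys %$cof;
  }
  return \%f12;
}

package main;

## the base class refuses to compute anything
my $ok  = eval { My::ToyBase->new->subprofile([1]); 1 };
my $err = $@;

## the subclass sums co-frequencies over the requested ids
my $rel = My::ToyRelation->new(cof => { 1 => { 7 => 2, 8 => 1 }, 2 => { 7 => 3 } });
my $prf = $rel->subprofile([1, 2]);
print "$_ => $prf->{$_}\n" foreach sort keys %$prf;
```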

Relation API: default: qinfo()

qinfo
 \%qinfo = $rel->qinfo($coldb, %opts);

Get query-info hash for profile administrivia (DDC KWIC links). %opts: as for profile(), additionally:

 qreqs => \@areqs,      ##-- as returned by $coldb->parseRequest($opts{query})
 gbreq => \%groupby,    ##-- as returned by $coldb->groupby($opts{groupby})

qinfoData
 (\@q1strs,\@q2strs,\@qxstrs,\@fstrs) = $rel->qinfoData($coldb,%opts);

Parses @opts{qw(qreqs gbreq)} into conditions on w1, w2, and metadata filters (for DDC linkup). Call this from subclass qinfo() methods.

AUTHOR

Bryan Jurish <moocow@cpan.org>

COPYRIGHT AND LICENSE

Copyright (C) 2015 by Bryan Jurish

This package is free software; you can redistribute it and/or modify it under the same terms as Perl itself, either Perl version 5.14.2 or, at your option, any later version of Perl 5 you may have available.

SEE ALSO

dcdb-create.perl(1), dcdb-query.perl(1), dcdb-info.perl(1), dcdb-export.perl(1), dcdb-dump.perl(1), DiaColloDB(3pm), perl(1), ...