NAME

dbpipeline - allow db commands to be assembled as pipelines in Perl

SYNOPSIS

    use Fsdb::Filter::dbpipeline qw(:all);
    dbpipeline(
        dbcol(qw(name test1)),
        dbroweval('_test1 += 5;')
    );

Or for more customized versions, see "dbpipeline_filter", "dbpipeline_sink", "dbpipeline_open2", and "dbpipeline_close2_hash".

DESCRIPTION

This module makes it easy to create pipelines in Perl, running each stage in a separate process. (Earlier versions used Perl threads.)

By default (as with all Fsdb modules), input comes from STDIN and output goes to STDOUT. Two helper functions, fromfile and tofile, read input from and write output to files instead.
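
For example, here is a minimal sketch of a file-to-file pipeline; it assumes fromfile and tofile simply stand in for the --input and --output options described under OPTIONS, and in.fsdb and out.fsdb are hypothetical file names:

    use Fsdb::Filter::dbpipeline qw(:all);
    # Hypothetical file names; fromfile/tofile are assumed to expand
    # to the --input and --output options described under OPTIONS.
    dbpipeline(
        fromfile('in.fsdb'),
        dbcol(qw(name test1)),
        tofile('out.fsdb'),
    );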

dbpipeline differs in several ways from other Fsdb::Filter modules: it has no corresponding Unix command (it is used only from within Perl), and it does not log its presence to the output stream (arguably a bug, but dbpipeline itself does not transform the data, so there is nothing to log).

OPTIONS

Unlike most Fsdb modules, dbpipeline defaults to --autorun.

This module also supports the standard fsdb options:

-d

Enable debugging output.

-i or --input InputSource

Read from InputSource, typically a file name, or - for standard input, or (if used from within Perl) an IO::Handle, Fsdb::IO, or Fsdb::BoundedQueue object.

-o or --output OutputDestination

Write to OutputDestination, typically a file name, or - for standard output, or (if used from within Perl) an IO::Handle, Fsdb::IO, or Fsdb::BoundedQueue object.

--autorun or --noautorun

By default, programs process their input automatically, but Fsdb::Filter objects in Perl do not run until you invoke the run() method. The --(no)autorun option controls that behavior within Perl; a sketch of --noautorun use appears after this list of options.

--help

Show help.

--man

Show full manual.
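
Returning to the --autorun option above: assuming (as in the SYNOPSIS) that options and module objects may be mixed in the argument list, and that the standard Fsdb::Filter setup_run_finish() method applies, a --noautorun pipeline defers processing until you explicitly request it:

    use Fsdb::Filter::dbpipeline qw(:all);
    # Sketch only: build the pipeline but defer execution.
    my $pipeline = new Fsdb::Filter::dbpipeline(
        '--noautorun',
        dbcol(qw(name test1)),
        dbroweval('_test1 += 5;'),
    );
    # ...later, run it explicitly:
    $pipeline->setup_run_finish;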

SEE ALSO

Fsdb(3)

CLASS FUNCTIONS

dbpipeline

    dbpipeline(@modules);

This shorthand routine creates a dbpipeline object and immediately runs it.

Thus Perl code becomes nearly as terse as shell code:

    dbpipeline(
        dbcol(qw(name test1)),
        dbroweval('_test1 += 5;'),
    );

The following commands currently have shorthand aliases:

cgi_to_db(1)
combined_log_format_to_db(1)
csv_to_db(1)
db_to_csv(1)
db_to_html_table(1)
dbcol(1)
dbcolcopylast(1)
dbcolcreate(1)
dbcoldefine(1)
dbcolhisto(1)
dbcolmerge(1)
dbcolmovingstats(1)
dbcolneaten(1)
dbcolpercentile(1)
dbcolrename(1)
dbcolscorrelate(1)
dbcolsplittocols(1)
dbcolsplittorows(1)
dbcolsregression(1)
dbcolstats(1)
dbcolstatscores(1)
dbfilealter(1)
dbfilecat(1)
dbfilediff(1)
dbfilepivot(1)
dbfilestripcomments(1)
dbfilevalidate(1)
dbformmail(1)
dbjoin(1)
dbmapreduce(1)
dbmerge(1)
dbmerge2(1)
dbmultistats(1)
dbrow(1)
dbrowaccumulate(1)
dbrowcount(1)
dbrowdiff(1)
dbroweval(1)
dbrowuniq(1)
dbrvstatdiff(1)
dbsort(1)
html_table_to_db(1)
kitrace_to_db(1)
mysql_to_db(1)
tabdelim_to_db(1)
tcpdump_to_db(1)
xml_to_db(1)

and

dbsubprocess(3)

dbpipeline_filter

    my($result_reader, $fred) = dbpipeline_filter($source, $result_reader_aref, @modules);

Set up a pipeline of @MODULES that filters data pushed through it, where the data comes from $SOURCE (anything accepted by Fsdb::Filter::parse_io_option, such as a Fsdb::IO::Reader object, a queue, or a filename).

Returns a $RESULT_READER Fsdb::IO::Reader object, created with $RESULT_READER_AREF as options, that produces the filtered data, and a $FRED that must be joined to guarantee output has completed.

Or if $RESULT_READER_AREF is [-raw_fh, 1], it just returns the IO::Handle to the pipe.

As an example, this code uses dbpipeline_filter to ensure the input (from $in, which is a filename or Fsdb::IO::Reader) is sorted numerically by column x:

    use Fsdb::Filter::dbpipeline qw(dbpipeline_filter dbsort);
    my($new_in, $new_fred) = dbpipeline_filter($in,
        [-comment_handler => $self->create_delay_comments_sub],
        dbsort(qw(--nolog -n x)));
    while (my $fref = $new_in->read_rowobj()) {
        # do something
    };
    $new_in->close;
    $new_fred->join();

dbpipeline_sink

    my($fsdb_writer, $fred) = dbpipeline_sink($writer_arguments_aref, @modules);

Set up a pipeline of @MODULES that is a data "sink", where the output is given by a --output argument, or goes to standard output (by default). The caller generates input into the pipeline by writing to a newly created $FSDB_WRITER, whose configuration is specified by the mandatory first argument $WRITER_ARGUMENTS_AREF. (These arguments should include the schema.) Returns this writer, and a $FRED that must be joined to guarantee output has completed.

If the first element of @MODULES is "--fred_exit_sub", then the next element is taken as a CODE reference that runs when the fred exits (and the pair is not passed on to the modules).

If the first element of @MODULES is "--fred_description", then the next element is taken as a text description of the fred.
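
For example, here is a minimal sketch (not using those fred options) that pushes a few rows through dbcolstats; the writer arguments and column name are illustrative only:

    use Fsdb::Filter::dbpipeline qw(dbpipeline_sink dbcolstats);
    # Sketch: write rows of a one-column table into dbcolstats;
    # results go to standard output (the default).
    my($writer, $fred) = dbpipeline_sink(
        [ -cols => [qw(x)] ],
        dbcolstats(qw(x)),
    );
    foreach my $value (1, 2, 3, 4) {
        $writer->write_row_from_aref([ $value ]);
    };
    $writer->close;
    $fred->join;    # wait until output has completed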

dbpipeline_open2

    my($fsdb_reader_fh, $fsdb_writer, $fred) = 
        dbpipeline_open2($writer_arguments_aref, @modules);

Set up a pipeline of @MODULES that is both a data sink and a data source. The caller generates input into the pipeline by writing to a newly created $FSDB_WRITER, whose configuration is specified by the mandatory argument $WRITER_ARGUMENTS_AREF. (These arguments should include the schema.) The output of the pipeline comes out on the newly created $FSDB_READER_FH. Returns this read handle and the writer, plus a $FRED that must be joined to guarantee output has completed.

(Unfortunately the interface is asymmetric, returning a raw read handle but an Fsdb::IO object for writing, because Fsdb::IO::Reader would block waiting for the header.)

Like IPC::Open2, with all of its pros and cons, such as the potential for deadlock.
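
A minimal sketch of use (column names and values are illustrative; the writer is closed before the raw output is drained, to sidestep the deadlock mentioned above):

    use Fsdb::Filter::dbpipeline qw(dbpipeline_open2 dbcol);
    # Sketch: feed two rows in, keep only the name column.
    my($read_fh, $writer, $fred) =
        dbpipeline_open2([ -cols => [qw(name test1)] ], dbcol(qw(name)));
    $writer->write_row_from_aref([ 'alice', 90 ]);
    $writer->write_row_from_aref([ 'bob', 85 ]);
    $writer->close;            # end of input
    print while <$read_fh>;    # raw fsdb output, including the header
    close $read_fh;
    $fred->join;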

dbpipeline_close2_hash

    my($href) = dbpipeline_close2_hash($fsdb_read_fh, $fsdb_writer, $pid);

Reads one row of output from $FSDB_READ_FH and returns it as a hash reference, after closing $FSDB_WRITER and joining the $PID.

Useful, for example, to get dbcolstats output cleanly.
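
For example, here is a minimal sketch that runs dbcolstats over a few values and picks the mean out of the resulting hash (column and variable names are illustrative):

    use Fsdb::Filter::dbpipeline
        qw(dbpipeline_open2 dbpipeline_close2_hash dbcolstats);
    my($read_fh, $writer, $fred) =
        dbpipeline_open2([ -cols => [qw(x)] ], dbcolstats(qw(x)));
    foreach my $value (1, 2, 3, 4) {
        $writer->write_row_from_aref([ $value ]);
    };
    # Closes the writer, joins the fred, and returns the single
    # dbcolstats output row as a hash reference.
    my $stats_href = dbpipeline_close2_hash($read_fh, $writer, $fred);
    print "mean: ", $stats_href->{mean}, "\n";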

new

    $filter = new Fsdb::Filter::dbpipeline(@arguments);

Create a new dbpipeline object, taking command-line-style @arguments.

set_defaults

    $filter->set_defaults();

Internal: set up defaults.

parse_options

    $filter->parse_options(@ARGV);

Internal: parse options.

_reap

    $filter->_reap();

Internal: reap any finished child processes.

setup

    $filter->setup();

Internal: setup, parse headers.

run

    $filter->run();

Internal: run over all IO.

finish

    $filter->finish();

Internal: we would write a trailer here, but we don't, because we depend on the last command in the pipeline to do that; dbpipeline itself has no valid output stream.

AUTHOR and COPYRIGHT

Copyright (C) 1991-2018 by John Heidemann <johnh@isi.edu>

This program is distributed under the terms of the GNU General Public License, version 2. See the file COPYING with the distribution for details.