NAME

dbmerge - merge all inputs in sorted order based on the specified columns

SYNOPSIS

dbmerge --input A.fsdb --input B.fsdb [-T TemporaryDirectory] [-nNrR] column [column...]

or cat A.fsdb | dbmerge --input - --input B.fsdb [-T TemporaryDirectory] [-nNrR] column [column...]

or dbmerge [-T TemporaryDirectory] [-nNrR] column [column...] --inputs A.fsdb [B.fsdb ...]

DESCRIPTION

Merge all provided, pre-sorted input files, producing one sorted result. Inputs can all be specified with --input, or one can come from standard input and the rest from --input.

Inputs must have identical schemas (columns, column order, and field separators).

Unlike dbmerge2, dbmerge supports an arbitrary number of input files.

Because this program is intended to merge multiple sources, it does not default to reading from standard input. If you wish to read from standard input, list - as an explicit input source.

Also, because we deal with multiple input files, this module doesn't output anything until it's run.

Dbmerge consumes a fixed amount of memory regardless of input size. It therefore buffers output on disk as necessary. (Merging is implemented as a series of two-way merges, so disk space is O(number of records).)
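The idea of a series of two-way merges with intermediate results kept on disk can be sketched as follows. This is an illustration only, not dbmerge's implementation: POSIX sort -m stands in for a single two-way merge step, the input data and file names are made up, and the inputs are plain headerless lines rather than fsdb files.

```shell
# Illustration only: POSIX `sort -m` stands in for one two-way merge step.
export LC_ALL=C                      # byte-wise ordering, for predictable results
work=$(mktemp -d)

# Three pre-sorted, headerless inputs (made-up data).
printf 'apple\nmango\n'     > "$work/in1"
printf 'banana\npear\n'     > "$work/in2"
printf 'cherry\nzucchini\n' > "$work/in3"

# The positional parameters act as the work queue of files still to merge.
set -- "$work/in1" "$work/in2" "$work/in3"
n=0
while [ "$#" -gt 1 ]; do
    n=$((n + 1))
    out="$work/merged.$n"
    # Merge the two files at the head of the queue into a temporary file.
    sort -m "$1" "$2" > "$out"
    shift 2
    # Queue the intermediate result for a later merge.
    set -- "$@" "$out"
done

# One file remains: the fully merged result.
merged=$(cat "$1")
printf '%s\n' "$merged"
rm -rf "$work"
```

Each pass writes its result to a temporary file, so the intermediate data lives on disk rather than in memory, at the cost of disk space that grows with the number of records.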

OPTIONS

General options:

--removeinputs: Delete the source files after they have been consumed. (Defaults to off, leaving the inputs in place.)
-T TmpDir: Where to put temporary files. If -T is not specified, the environment variable TMPDIR is used; the default is /tmp.
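The documented precedence for the temporary directory (-T first, then TMPDIR, then /tmp) can be sketched in shell. The function pick_tmpdir and the variable opt_T are hypothetical names used only for this illustration; how dbmerge resolves the value internally is not shown here.

```shell
# pick_tmpdir is a hypothetical helper; its argument plays the role of the
# value passed to -T (empty when -T was not given).
pick_tmpdir() {
    opt_T=$1
    # Prefer -T, then $TMPDIR, then /tmp.
    printf '%s\n' "${opt_T:-${TMPDIR:-/tmp}}"
}

# Each case runs in a subshell so the TMPDIR changes do not leak out.
from_flag=$(TMPDIR=/var/tmp; pick_tmpdir /scratch)
from_env=$(TMPDIR=/var/tmp; pick_tmpdir "")
from_default=$(unset TMPDIR; pick_tmpdir "")

echo "$from_flag $from_env $from_default"
# prints: /scratch /var/tmp /tmp
```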

Sort specification options (can be interspersed with column names):

-r or --descending: sort in reverse order (high to low)
-R or --ascending: sort in normal order (low to high)
-n or --numeric: sort numerically
-N or --lexical: sort lexicographically
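The difference between numeric and lexical ordering, and between ascending and descending order, can be seen with a small experiment. The sketch below uses POSIX sort as a stand-in, so the flags shown are sort's flags, not dbmerge's (in particular, dbmerge's -R and -N have no equivalent here). The sample values are made up.

```shell
export LC_ALL=C    # byte-wise comparison for the lexical case

# The same three values ordered three different ways.
lexical=$(printf '9\n10\n100\n' | sort | paste -s -d ' ' -)
numeric=$(printf '9\n10\n100\n' | sort -n | paste -s -d ' ' -)
numeric_desc=$(printf '9\n10\n100\n' | sort -n -r | paste -s -d ' ' -)

echo "lexical:            $lexical"
echo "numeric:            $numeric"
echo "numeric descending: $numeric_desc"
```

Lexical order puts 10 and 100 before 9 because it compares character by character, while numeric order compares values. Since the inputs must be pre-sorted, the sort specification given to dbmerge presumably needs to match the order in which the inputs were sorted.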

This module also supports the standard fsdb options:

-d: Enable debugging output.
-i or --input InputSource: Read from InputSource, typically a file name, or - for standard input, or (if in Perl) an IO::Handle, Fsdb::IO, or Fsdb::BoundedQueue object.
-o or --output OutputDestination: Write to OutputDestination, typically a file name, or - for standard output, or (if in Perl) an IO::Handle, Fsdb::IO, or Fsdb::BoundedQueue object.
--autorun or --noautorun: By default, programs process automatically, but Fsdb::Filter objects in Perl do not run until you invoke the run() method. The --(no)autorun option controls that behavior within Perl.
--help: Show help.
--man: Show full manual.

SAMPLE USAGE

Input:

File a.fsdb:

#fsdb cid cname
11 numanal
10 pascal

File b.fsdb:

#fsdb cid cname
12 os
13 statistics

These two files are both sorted by cname, and they have identical schemas.
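Both preconditions can be checked before merging. The following is a rough sketch using only POSIX tools, not an fsdb facility. It assumes tab-separated fields and recreates the two sample files in a temporary directory; the helper is_sorted_by_cname is a hypothetical name.

```shell
export LC_ALL=C
work=$(mktemp -d)
tab=$(printf '\t')

# Recreate the sample inputs (assumption: fields are tab-separated).
printf '#fsdb cid cname\n11\tnumanal\n10\tpascal\n' > "$work/a.fsdb"
printf '#fsdb cid cname\n12\tos\n13\tstatistics\n'  > "$work/b.fsdb"

# Schema check: the header lines must match exactly.
if [ "$(head -n 1 "$work/a.fsdb")" = "$(head -n 1 "$work/b.fsdb")" ]; then
    schema=same
else
    schema=different
fi

# Sort check: drop comment lines, then ask sort whether column 2 (cname)
# is already in order. `sort -c` exits non-zero if it is not.
is_sorted_by_cname() {
    grep -v '^#' "$1" | sort -c -t "$tab" -k 2,2 2>/dev/null
}

sorted=yes
for f in "$work/a.fsdb" "$work/b.fsdb"; do
    is_sorted_by_cname "$f" || sorted=no
done

echo "schema=$schema sorted=$sorted"
rm -rf "$work"
```

Comparing whole header lines also catches a differing column order, and it would catch a differing field-separator flag if the header carries one.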

Command:

dbmerge --input a.fsdb --input b.fsdb cname

or cat a.fsdb | dbmerge --input - --input b.fsdb cname

Output:

#fsdb cid cname
11 numanal
12 os
10 pascal
13 statistics
#  | dbmerge --input a.fsdb --input b.fsdb cname

CLASS FUNCTIONS

new

$filter = new Fsdb::Filter::dbmerge(@arguments);

Create a new object, taking command-line arguments.

set_defaults

$filter->set_defaults();

Internal: set up defaults.

parse_options

$filter->parse_options(@ARGV);

Internal: parse command-line arguments.

setup

$filter->setup();

Internal: setup, parse headers.

segment_next_output

$out = $self->segment_next_output($input_finished)

Internal: return an Fsdb::IO::Writer as $out that points either to our output or to a temporary file, depending on how things are going.

segment_merge

$self->segment_merge();

Merge queued files, if any.

We process the work queue in a file-system-cache-friendly order, based on ideas from "Information and Control in Gray-box Systems" by the Arpaci-Dusseaus at SOSP 2001.

Idea: each "pass" through the queue, reverse the processing order so that the most recent data (that's hopefully in the file system cache in memory) is handled first.

This algorithm isn't perfect (sometimes, if there's an odd number of files in the queue, you reach way back in time), but most of the time it's quite good.

run

$filter->run();

Internal: run over each row.

AUTHOR and COPYRIGHT

This program is distributed under the terms of the GNU General Public License, version 2. See the file COPYING with the distribution for details.
