NAME

Log::Parallel - main driver for the batch log processing system

SYNOPSIS

 use Log::Parallel;
 use Log::Parallel::ConfigCheck qw(validate_config);
 use Proc::JobQueue::DependencyQueue;

 options();

 run($opts);

 validate_config($config);

 add_recnums($config);

 $files_by_recnum{$_->{recnum}} = get_files_by_srec($_, $config->{hostsinfo}) 
        for @{$config->{sources}};

 my $dependency_graph = make_dependency_graph(make_task_list($opts, $config, %files_by_recnum));

 my $job_queue = Proc::JobQueue::DependencyQueue->new(
        dependency_graph => $dependency_graph,
        hosts            => [],
        hold_all         => 1,
 );

 setup_slave_hosts($config, $job_queue);

DESCRIPTION

This is the main driver module at the heart of the batch log processing system. It sets things up, figures out what jobs can run and in what order, and queues them up to run.
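The idea of working out which jobs can run, and in what order, can be illustrated with a small dependency-ordering sketch. This is not the module's actual code: the job names, the %depends_on hash, and the run_order function are hypothetical, and the real system delegates this work to Proc::JobQueue::DependencyQueue. The sketch uses a plain topological sort (Kahn's algorithm) in core Perl only.

```perl
use strict;
use warnings;

# Hypothetical jobs: each job name maps to the jobs it depends on.
my %depends_on = (
    parse_logs   => [],
    sessionize   => ['parse_logs'],
    daily_report => ['sessionize'],
    geo_summary  => ['parse_logs'],
);

# Return the job names in an order where every job comes after
# all of its prerequisites (Kahn's algorithm).
sub run_order {
    my (%deps) = @_;
    my (%pending, %dependents);

    for my $job (keys %deps) {
        # Number of prerequisites that have not been scheduled yet.
        $pending{$job} = scalar @{ $deps{$job} };
        # Reverse index: prerequisite => jobs waiting on it.
        push @{ $dependents{$_} }, $job for @{ $deps{$job} };
    }

    # Jobs with no prerequisites can run immediately.
    my @ready = sort grep { $pending{$_} == 0 } keys %pending;
    my @order;

    while (@ready) {
        my $job = shift @ready;
        push @order, $job;
        # Finishing $job may unblock the jobs that were waiting on it.
        for my $next (sort @{ $dependents{$job} || [] }) {
            push @ready, $next if --$pending{$next} == 0;
        }
    }

    # Anything left unscheduled is stuck behind a cycle or an unknown job.
    die "dependency cycle or missing job detected\n"
        if scalar(@order) < scalar(keys %deps);

    return @order;
}

print join(' -> ', run_order(%depends_on)), "\n";
```

A real queue would start every ready job as soon as a host is free rather than producing one linear order, but the bookkeeping is the same: a job becomes runnable once its count of unfinished prerequisites reaches zero.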

Everything it does is driven by the configuration data, which is probably parsed by Config::YAMLMacros and validated by Log::Parallel::ConfigCheck.

Only one program, process_logs, is expected to use this module. As such, documenting its API is left as an exercise for the reader. Use the source.

SEE ALSO

This is used by process_logs. It reads configurations validated by Log::Parallel::ConfigCheck. It uses a Proc::JobQueue::DependencyQueue to queue the jobs that need to run. The jobs it runs are farmed out to remote systems using RPC::ToWorker. On the remote system, the code that runs the jobs is Log::Parallel::Task. The inputs to the jobs are parsed using a parser found by Log::Parallel::Parsers and the outputs are written using a writer invoked by Log::Parallel::Writers. The main writer is Log::Parallel::TSV. The time formats that describe when jobs should run are parsed by Log::Parallel::Durations. This module has support modules: Log::Parallel::Paths, Log::Parallel::Metadata, Log::Parallel::Misc.

Some modules that are handy for writing jobs are: Log::Parallel::Sql, Stream::Aggregate, Log::Parallel::Geo::IP.

LICENSE

This package may be used and redistributed under the terms of either the Artistic 2.0 or LGPL 2.1 license.