Proc::JobQueue - job queue with dependencies, base class
use Proc::JobQueue;
$queue = Proc::JobQueue->new(%parameters);
$queue->addhost($host, %parameters);
$queue->add($job);
$queue->add($job, $host);
$queue->startmore();
$queue->hold($new_value);
$queue->checkjobs();
$queue->jobdone($job, $do_startmore, @exit_code);
$queue->alldone();
$queue->status();
$queue->startjob($host, $jobnum, $job);
Generic queue of "jobs". Most likely to be subclassed for different situations. Jobs are registered. Hosts are registered. Jobs may or may not be tied to particular hosts. Jobs are started on hosts. Jobs may or may not have dependencies on each other.
Proc::JobQueue does not start jobs on its own: something must call startmore() every now and then. Two subclasses complete Proc::JobQueue by providing this: Proc::JobQueue::EventQueue, which provides an event-based framework using IO::Event, and Proc::JobQueue::BackgroundQueue, which provides a simple loop-until-all-the-jobs-are-done construct.
From the job's point of view, it will be started with:
$job->jobnum($jobnum);
$jobnum = $job->jobnum();
$job->queue($queue);
$job->host($host);
$job->start();
When jobs complete, they must call:
$queue->jobdone($job, $do_startmore, @exit_code);
Jobs are run on hosts which must be added with:
$queue->addhost($hostname, jobs_per_host => $number_to_run_on_this_host_at_one_time)
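The start-up and completion calls above can be illustrated with a toy model. This is only a sketch: ToyQueue and ToyJob are hypothetical stand-ins written for this example, not Proc::JobQueue classes, and the finish() method and the starting job number of 1 are assumptions made for illustration.

```perl
use strict;
use warnings;

# Toy stand-in for the queue: NOT Proc::JobQueue, only the calls named above.
package ToyQueue;

sub new {
    my ($class) = @_;
    my $self = { hosts => {}, waiting => [], next_jobnum => 1, log => [] };
    return bless $self, $class;
}

# Register a host and how many jobs it may run at one time.
sub addhost {
    my ($self, $host, %params) = @_;
    $self->{hosts}{$host} = { limit => $params{jobs_per_host} // 1, running => 0 };
}

sub add { my ($self, $job) = @_; push @{ $self->{waiting} }, $job }

# Start waiting jobs on every host that still has a free slot.
sub startmore {
    my ($self) = @_;
    for my $host (sort keys %{ $self->{hosts} }) {
        my $slot = $self->{hosts}{$host};
        while ($slot->{running} < $slot->{limit} && @{ $self->{waiting} }) {
            my $job = shift @{ $self->{waiting} };
            $slot->{running}++;
            # The start-up sequence a job sees:
            $job->jobnum($self->{next_jobnum}++);
            $job->queue($self);
            $job->host($host);
            $job->start();
        }
    }
}

# Called by a job when it completes; frees the job's slot on its host.
sub jobdone {
    my ($self, $job, $do_startmore, @exit_code) = @_;
    $self->{hosts}{ $job->host }{running}--;
    push @{ $self->{log} },
        sprintf("job %d on %s exit %d", $job->jobnum, $job->host, $exit_code[0] // 0);
    $self->startmore() if $do_startmore;
}

# Toy job: combined getter/setters matching the calls the queue makes.
package ToyJob;

sub new { my ($class, %args) = @_; my $self = { %args }; return bless $self, $class }
sub jobnum { my $self = shift; $self->{jobnum} = shift if @_; return $self->{jobnum} }
sub queue  { my $self = shift; $self->{queue}  = shift if @_; return $self->{queue} }
sub host   { my $self = shift; $self->{host}   = shift if @_; return $self->{host} }
sub start  { my $self = shift; $self->{started} = 1 }

# Hypothetical helper: report completion to the queue with exit code 0.
sub finish { my $self = shift; $self->queue->jobdone($self, 1, 0) }

package main;

my $queue = ToyQueue->new;
$queue->addhost('alpha', jobs_per_host => 2);
my @jobs = map { ToyJob->new(name => "job$_") } 1 .. 3;
$queue->add($_) for @jobs;
$queue->startmore();    # starts two jobs; the third waits for a free slot
$jobs[0]->finish();     # frees a slot; a true $do_startmore starts the third job
print "$_\n" for @{ $queue->{log} };
```

In this sketch the third job only starts once the first one reports back through jobdone, which is the behaviour the jobs_per_host limit is meant to produce.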
Jobs can be shell commands (Proc::JobQueue::Command), a sequence of other jobs (Proc::JobQueue::Sequence), some standard file operations (Proc::JobQueue::Move, Proc::JobQueue::Sort), custom subclasses of the base job class (Proc::JobQueue::Job), arbitrary perl code (Proc::JobQueue::DependencyJob, Proc::JobQueue::Task), or arbitrary perl code pushed to a remote system to run (Proc::JobQueue::RemoteDependencyJob).
The parameters for new are:
Default number of jobs to run on each host simultaneously. This can be overridden on a per-host basis.
If any one host has more than this many jobs waiting for it, no can-run-on-any-host jobs will be started. This is to prevent the queue for this one overloaded host from getting too large.
This is the starting job number. Job numbers are sometimes displayed. They increment for each new job.
If true, prevent any jobs from starting until $queue->hold(0) is called.
A dependency graph to track jobs and tasks that have dependencies and are not yet ready to run because of their dependencies.
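The overload rule described among the parameters above (no can-run-on-any-host jobs start while some host has too many jobs waiting for it) can be sketched as a small predicate. The function name, the $overload_limit argument, and the %waiting structure are hypothetical and exist only for this illustration; they are not part of the Proc::JobQueue API.

```perl
use strict;
use warnings;

# Hypothetical helper: decide whether jobs that may run on any host can start.
# $overload_limit is the threshold; %waiting maps a host name to the number
# of jobs currently waiting specifically for that host.
sub may_start_any_host_jobs {
    my ($overload_limit, %waiting) = @_;
    for my $host (keys %waiting) {
        # "More than this many" is a strict comparison.
        return 0 if $waiting{$host} > $overload_limit;
    }
    return 1;
}

print may_start_any_host_jobs(3, alpha => 3, beta => 1) ? "start\n" : "wait\n";
print may_start_any_host_jobs(3, alpha => 4, beta => 1) ? "start\n" : "wait\n";
```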
Adjusts the same parameters that can be set with new.
addhost($host, %parameters) registers a new host. Parameters are:
jobs_per_host: the number of jobs that can be run at once on this host. This defaults to the jobs_per_host parameter of the $queue.
add($job, $host) adds a job object to the runnable queue. The job object must be a Proc::JobQueue::Job or a subclass of Proc::JobQueue::Job. The $host parameter is optional: if not set, the job can be run on any host.
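The $job object is started with:
$job->jobnum($jobnum);
$jobnum = $job->jobnum();
$job->queue($queue);
$job->host($host);
$job->start();
When the job completes, it must call:
$queue->jobdone($job, $do_startmore, @exit_code);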
Jobs added this way must be ready to run with no dependencies on other jobs. Jobs and tasks that have dependencies should be added with:
$queue->graph->add($job);
graph() gets or sets the dependency graph used to track jobs and tasks that have dependencies. The dependency graph is an Object::Dependency object (or at least something that implements the same API). Items in the dependency graph are not in the runnable queue. They are moved to the runnable queue once they have no unmet dependencies.
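The idea of holding items back until their dependencies are met can be shown with a toy tracker. This sketch does not use Object::Dependency or its API; the %unmet hash, the item names, and the release_ready() and complete() helpers are all hypothetical.

```perl
use strict;
use warnings;

# Toy dependency tracker: each item maps to the items it still waits on.
my %unmet = (
    fetch  => [],
    parse  => ['fetch'],
    report => ['fetch', 'parse'],
);
my @runnable;    # stands in for the runnable queue

# Move every item with no remaining dependencies to the runnable queue.
sub release_ready {
    for my $item (sort keys %unmet) {
        next if @{ $unmet{$item} };
        push @runnable, $item;
        delete $unmet{$item};
    }
}

# Mark an item complete: remove it from every remaining dependency list,
# then release whatever became ready as a result.
sub complete {
    my ($done) = @_;
    for my $deps (values %unmet) {
        @$deps = grep { $_ ne $done } @$deps;
    }
    release_ready();
}

release_ready();      # only "fetch" has no dependencies
complete('fetch');    # "parse" becomes runnable
complete('parse');    # "report" becomes runnable
print "@runnable\n";
```

Items leave the tracker in dependency order, which mirrors the description above: nothing reaches the runnable queue while it still has unmet dependencies.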
When jobs complete, they must call jobdone. If $do_startmore is true, then startmore() will be called. A true exit code signals an error and it is used by Proc::JobQueue::CommandQueue.
This marks the $job as complete and a new job can start in its place. For Proc::JobQueue::DependencyJob jobs, this leaves the dependency in place.
alldone() checks the job queue. It returns true if all jobs have completed and the queue is empty.
status() prints a queue status to STDERR showing what's running on which hosts. Printing is suppressed unless $Proc::JobQueue::status_frequency seconds have passed since the last call to status().
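The throttling behaviour can be sketched as follows. This is a toy model, not the module's implementation: the local $status_frequency variable, the toy_status() function, and the explicit $now argument (used here so the example does not depend on the clock) are assumptions, and the sketch measures elapsed time from the last status that was actually printed.

```perl
use strict;
use warnings;

my $status_frequency = 2;    # seconds between printed reports (toy value)
my $last_printed     = 0;    # time of the last report that was printed

# Print $message to STDERR unless a report was printed too recently.
# Returns 1 if the message was printed and 0 if it was suppressed.
sub toy_status {
    my ($now, $message) = @_;
    return 0 if $now - $last_printed < $status_frequency;
    $last_printed = $now;
    print STDERR "$message\n";
    return 1;
}

# Three calls one second apart: the middle one is suppressed.
my @printed = map { toy_status(@$_) }
    ([100, 'first report'], [101, 'second report'], [102, 'third report']);
```

With the real module, the interval is controlled by assigning to the package variable named above, for example $Proc::JobQueue::status_frequency = 30.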
startmore() starts more jobs if possible. The return value is true if there are no more jobs to start.
hold($new_value) gets (or sets, if $new_value is defined) the queue's hold-all-jobs parameter. If hold-all-jobs is true, no jobs will be started or pulled out of the dependency graph (if there is one).
These methods may be needed by subclassers or anyone poking around the internals:
checkjobs() checks Proc::Background-style jobs to see if any have finished.
startjob($host, $jobnum, $job) starts a single job. It is used by startmore() and probably should not be used otherwise.
Called to shut down. Used by Proc::JobQueue::EventQueue.
Proc::JobQueue needs canonical hostnames. It gets them by default with Proc::JobQueue::CanonicalHostnames. You can override this default by overriding $Proc::JobQueue::host_canonicalizer with the name of a perl module to use instead of Proc::JobQueue::CanonicalHostnames.
Helper functions are provided by Proc::JobQueue and are available via explicit import:
use Proc::JobQueue qw(my_hostname canonicalize is_remote_host);
Proc::JobQueue::Job
Proc::JobQueue::Command
Proc::JobQueue::DependencyJob
Proc::JobQueue::RemoteDependencyJob
Proc::JobQueue::EventQueue
Proc::JobQueue::BackgroundQueue
Copyright (C) 2007-2008 SearchMe, Inc. Copyright (C) 2008-2010 David Sharnoff. Copyright (C) 2011 Google, Inc. This package may be used and redistributed under the terms of either the Artistic 2.0 or LGPL 2.1 license.