Parallel::Batch - Run a large number of similar processes using bounded parallelism
    use Parallel::Batch;

    my $batch = Parallel::Batch->new({
        code     => \&frobnicate,
        jobs     => [ ... ],
        maxprocs => 8,
    });
    $batch->run();
Parallel::Batch addresses a common problem in making efficient use of modern multi-CPU computers: you have a large number of independent pieces of data that all need to be processed in the same way, and several of them can be processed at the same time.
There are a few trivial ways to execute a large number of jobs. You could run the entire set serially, but this leaves most of the available processors idle. You could create one process per job and run all n jobs simultaneously, but this tends to quickly exhaust other resources such as memory and I/O bandwidth, making the entire run slower. Or you could divide the set into m equally sized groups and have each processor run its group serially, but this usually wastes time at the end, because some jobs take longer than others and the processors with shorter groups sit idle while the rest finish.
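The cost of fixed groups can be seen with a little arithmetic. The following is a minimal sketch, not part of Parallel::Batch: the function names static_makespan and dynamic_makespan and the job durations are made up for illustration. It compares the total elapsed time when jobs are pre-assigned to workers in equally sized groups with the time when each job is handed to whichever worker becomes free first.

```perl
use strict;
use warnings;
use List::Util qw(max sum);

# Static partitioning: job i is pre-assigned to worker i % $workers,
# and each worker runs its own group serially.
sub static_makespan {
    my ($workers, @durations) = @_;
    my @load = (0) x $workers;
    $load[ $_ % $workers ] += $durations[$_] for 0 .. $#durations;
    return max(@load);
}

# Dynamic scheduling: each job starts on whichever worker is idle first.
sub dynamic_makespan {
    my ($workers, @durations) = @_;
    my @free_at = (0) x $workers;
    for my $d (@durations) {
        # Pick the worker that becomes idle earliest.
        my ($idx) = sort { $free_at[$a] <=> $free_at[$b] } 0 .. $#free_at;
        $free_at[$idx] += $d;
    }
    return max(@free_at);
}

# Hypothetical job lengths in seconds: two long jobs among many short ones.
my @durations = (8, 1, 1, 1, 8, 1, 1, 1);

printf "serial:  %d\n", sum(@durations);
printf "static:  %d\n", static_makespan(2, @durations);
printf "dynamic: %d\n", dynamic_makespan(2, @durations);
```

With these durations and two workers, the serial run takes 22 seconds, the fixed groups take 18 (both long jobs land in the same group), and the dynamic assignment takes 11.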
This module works by calling fork() to create a new process, invoking a user-specified function on the next piece of data within that process, and returning once all data has been processed and all child processes have exited. It also keeps track of the number of jobs in progress, and keeps this under a set limit by delaying new forks until existing processes terminate.
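The general technique described above can be sketched with core Perl alone. This is an illustration of bounded forking under stated assumptions, not the module's actual implementation: the function name run_bounded and its return value (the count of children that exited with status 0) are hypothetical.

```perl
use strict;
use warnings;

# Run $code->($job) for every job, each in its own child process,
# keeping at most $maxprocs children alive at any time.
# Returns the number of children that exited with status 0.
sub run_bounded {
    my ($code, $jobs, $maxprocs) = @_;
    my $running = 0;
    my $ok      = 0;

    for my $job (@$jobs) {
        # At the limit: block until one child exits before forking again.
        if ($running >= $maxprocs) {
            wait();
            $running--;
            $ok++ if $? == 0;
        }

        my $pid = fork();
        die "fork failed: $!" unless defined $pid;

        if ($pid == 0) {
            # Child: process one piece of data, then exit.
            $code->($job);
            exit 0;
        }

        # Parent: one more job in progress.
        $running++;
    }

    # Reap the children that are still running.
    while ($running > 0) {
        wait();
        $running--;
        $ok++ if $? == 0;
    }
    return $ok;
}

# Example: five short jobs, at most two running at once.
my $done = run_bounded(
    sub { my ($n) = @_; select(undef, undef, undef, 0.05 * $n) },
    [ 1 .. 5 ],
    2,
);
print "$done of 5 jobs exited cleanly\n";
```

Because each job runs in a forked child, anything the function returns or stores in variables is lost when the child exits. A job that needs to hand results back has to write them somewhere the parent can read, such as a file or a pipe.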
Options:
The following options can be passed to the constructor in a hashref, or retrieved or changed later using their own accessor methods.
code
Coderef to be run on each piece of data. It will be passed a single argument, which is an element of the jobs array.
jobs
Arrayref of data objects to be processed.
maxprocs
Maximum number of child processes that should be running at any time.
progress_cb
Hashref of progress callbacks.
run()
Start running the jobs, and return once all are completed.
Parallel::Batch can report its progress through application-defined callbacks as it runs. If the progress_cb argument is a hashref containing any of the following keys, their values will be called at the points described:
Will be called just before any processes are spawned.
Will be called after each new process has been created.
Will be called when a child process exits.
Will be called after all jobs are completed and all child processes have terminated.
Stephen Cavilia, <sac@atomicradi.us>
Copyright (C) 2011 by Stephen Cavilia
This library is free software; you can redistribute it and/or modify it under the same terms as Perl itself, either Perl version 5.12.2 or, at your option, any later version of Perl 5 you may have available.
To install Parallel::Batch, copy and paste the appropriate command into your terminal.
cpanm
cpanm Parallel::Batch
CPAN shell
perl -MCPAN -e shell
install Parallel::Batch
For more information on module installation, please visit the detailed CPAN module installation guide.