NAME

Parallel::ForkManager::Segmented - use Parallel::ForkManager on batches / segments of items.

VERSION

version 0.10.1

SYNOPSIS

    use Parallel::ForkManager::Segmented ();
    use Path::Tiny qw/ path /;

    my $NUM    = 30;
    my $temp_d = Path::Tiny->tempdir;

    my @queue = ( 1 .. $NUM );
    my $proc  = sub {
        my $fn = shift;
        $temp_d->child($fn)->spew_utf8("Wrote $fn .\n");
        return;
    };
    Parallel::ForkManager::Segmented->new->run(
        {
            WITH_PM      => 1,
            items        => \@queue,
            nproc        => 3,
            batch_size   => 8,
            process_item => $proc,
        }
    );

DESCRIPTION

This module builds upon Parallel::ForkManager, allowing one to pass a batch (or "segment") of several items for processing inside a worker. Batching is intended to reduce the forking/exiting overhead, since each worker handles several items rather than one.
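As a rough illustration of what "segmenting" a queue means, here is a minimal core-Perl sketch. The helper split_into_batches is hypothetical and is not part of this module; the module's own partitioning of the items may differ in its details.

```perl
use strict;
use warnings;

# Hypothetical helper (NOT part of Parallel::ForkManager::Segmented):
# split a list into consecutive batches of at most $batch_size items.
sub split_into_batches
{
    my ( $items, $batch_size ) = @_;

    my @pending = @$items;    # work on a copy; the caller's array is untouched
    my @batches;
    while (@pending)
    {
        # splice returns fewer items when fewer than $batch_size remain.
        push @batches, [ splice @pending, 0, $batch_size ];
    }
    return \@batches;
}

my $batches = split_into_batches( [ 1 .. 30 ], 8 );

foreach my $batch (@$batches)
{
    # If each batch were handed to one forked child, 30 items would
    # need 4 forks here rather than 30.
    print scalar(@$batch), " items: @$batch\n";
}
```

With 30 items and a batch size of 8, the sketch yields three batches of 8 items and a final batch of 6.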

METHODS

my $obj = Parallel::ForkManager::Segmented->new;

Initializes a new object.

my \%ret = $obj->process_args(+{ %ARGS })

TBD.

$obj->run(+{ %ARGS });

Runs the processing. Accepts the following named arguments:

  • process_item

    A reference to a subroutine that accepts one item and processes it.

  • items

    A reference to the array of items.

  • stream_cb

    A reference to a callback for returning new batches of items (cannot be specified along with 'items').

    Accepts a hash ref with the key 'size' specifying an integer of the maximal item count.

    Returns a hash ref with the key 'items' pointing to an array reference of items or undef() upon end-of-stream.

    E.g.:

            # $items is a reference to an array of pending items,
            # declared in the enclosing scope.
            $stream_cb = sub {
                my ($args) = @_;
                my $size = $args->{size};

                return +{ items =>
                        scalar( @$items ? [ splice @$items, 0, $size ] : undef() ),
                };
            };

    Added at version 0.4.0.

  • nproc

    The number of child processes to use.

  • batch_size

    The number of items in each batch.

  • disable_fork

    Disables forking and the use of Parallel::ForkManager, and processes the items serially.

  • process_batch

    [Added in v0.2.0.]

    A reference to a subroutine that accepts a reference to an array holding a whole batch and processes that batch as a unit. If specified, process_item is not used.

    Example:

        use strict;
        use warnings;
        use Test::More tests => 30;
        use Parallel::ForkManager::Segmented ();
        use Path::Tiny qw/ path /;
    
        {
            my $NUM    = 30;
            my $temp_d = Path::Tiny->tempdir;
    
            my @queue = ( 1 .. $NUM );
            my $proc  = sub {
                foreach my $fn ( @{ shift(@_) } )
                {
                    $temp_d->child($fn)->spew_utf8("Wrote $fn .\n");
                }
                return;
            };
            Parallel::ForkManager::Segmented->new->run(
                {
                    WITH_PM       => 1,
                    items         => \@queue,
                    nproc         => 3,
                    batch_size    => 8,
                    process_batch => $proc,
                }
            );
            foreach my $i ( 1 .. $NUM )
            {
                # TEST*30
                is( $temp_d->child($i)->slurp_utf8, "Wrote $i .\n", "file $i", );
            }
        }
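The stream_cb contract described above (called with a 'size' key, returns an 'items' key that is undef at end-of-stream) can be exercised without forking. The following is a minimal core-Perl sketch: make_stream_cb and drain_stream are hypothetical helpers written for this illustration, and drain_stream is an assumption about how a consumer could use the callback, not this module's actual loop.

```perl
use strict;
use warnings;

# Hypothetical helper: build a stream callback over a fixed list.
# The closure owns @pending, so the caller's data is not modified.
sub make_stream_cb
{
    my @pending = @_;
    return sub {
        my ($args) = @_;
        my $size = $args->{size};
        return +{
            items => ( @pending ? [ splice @pending, 0, $size ] : undef ),
        };
    };
}

# Hypothetical consumer (an assumption, NOT the module's real loop):
# keep requesting batches until 'items' comes back undefined.
sub drain_stream
{
    my ( $stream_cb, $batch_size, $process_batch ) = @_;

    my $count = 0;
    while (
        defined( my $batch = $stream_cb->( { size => $batch_size } )->{items} ) )
    {
        $process_batch->($batch);
        ++$count;
    }
    return $count;
}

my @seen;
my $num_batches = drain_stream(
    make_stream_cb( 1 .. 20 ),
    8,
    sub { push @seen, @{ $_[0] }; return; },
);
print "Processed ", scalar(@seen), " items in $num_batches batches\n";
```

The processing callback here has the same shape as process_batch: it receives one array reference per batch. A callback that returns an empty array reference instead of undef would never signal end-of-stream to this consumer, so the undef return matters.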

SEE ALSO

Parallel::ForkManager is the underlying module that this module is based on.

Parallel::Map::Segmented provides an API that is mostly compatible with Parallel::ForkManager::Segmented, but is based on Parallel::Map (by MSTROUT) and IO::Async::Function (by PEVANS). IO::Async provides a less snowflake approach. Thanks guys!

Parallel::ForkManager::Segmented::Base is the base class of Parallel::Map::Segmented and Parallel::ForkManager::Segmented. It exists to avoid duplicated code, following the DRY ("Don't repeat yourself") principle.

https://perl-begin.org/uses/multitasking/ is a page about multitasking in Perl that rounds up the usual suspects.

SUPPORT

Websites

The following websites have more information about this module, and may be of help to you. As always, in addition to those websites please use your favorite search engine to discover more resources.

Bugs / Feature Requests

Please report any bugs or feature requests by email to bug-parallel-forkmanager-segmented at rt.cpan.org, or through the web interface at https://rt.cpan.org/Public/Bug/Report.html?Queue=Parallel-ForkManager-Segmented. You will be automatically notified of any progress on the request by the system.

Source Code

The code is open to the world, and available for you to hack on. Please feel free to browse it and play with it, or whatever. If you want to contribute patches, please send me a diff or prod me to pull from your repository :)

https://github.com/shlomif/perl-Parallel-ForkManager-Segmented

  git clone https://github.com/shlomif/perl-Parallel-ForkManager-Segmented.git

AUTHOR

Shlomi Fish <shlomif@cpan.org>

BUGS

Please report any bugs or feature requests on the bugtracker website https://github.com/shlomif/parallel-forkmanager-segmented/issues

When submitting a bug or request, please include a test-file or a patch to an existing test-file that illustrates the bug or desired feature.

COPYRIGHT AND LICENSE

This software is Copyright (c) 2018 by Shlomi Fish.

This is free software, licensed under:

  The MIT (X11) License