Carsten Gaebler

NAME

Parallel::MPM::Prefork - A simple non-threaded, non-OO, pre-forking, self-regulating, multi-purpose multi-processing module. Period.

SYNOPSIS

  use Data::Dumper;
  use Parallel::MPM::Prefork;
  use sigtrap qw(die normal-signals);
  use Socket;

  pf_init(
    min_spare_servers => 2,
    max_spare_servers => 4,
    start_servers => 3,
    max_servers => 20,
    data_hook_in_main => 0,
    child_data_hook => sub {
      my ($pid, $data, $exitcode) = @_;
      print "$pid ", Dumper($data), "\n";
    },
    child_sigh => {
      HUP => sub { print "$$ ignoring SIGHUP\n" },
      'TERM INT' => sub { print "$$ exiting (SIG$_[0])\n"; exit },
    },
  ) or die $Parallel::MPM::Prefork::error;

  my $SOCK = mksock();
  $SIG{TERM} = $SIG{INT} = sub { pf_done(0) };

  # Variant 1: More convenient, less flexible.
  1 while pf_whip_kids(\&echo_server, [$SOCK]);

  # Variant 2: More flexible, less convenient.
  while (1) {
    my $pid = pf_kid_new() // die "Could not fork: $!";
    last if $pid < 0;
    next if $pid;
    echo_server($SOCK);
    pf_kid_exit(0, \'Bazinga!', 1);
  }

  END {
    pf_done();
  }

  # A simple echo server.
  sub echo_server {
    my $sock = shift;
    CONN: while (accept my $conn, $sock) {
      pf_kid_busy(); # tell parent we're busy
      /^quit/ ? last CONN : syswrite $conn, $_ while <$conn>;
      pf_kid_yell({ foo => 'bar' }, 1);  # send data to parent
      pf_kid_idle(); # tell parent we're idle again
    }
  }

  sub mksock {
    socket my $SOCK, AF_INET, SOCK_STREAM, 0;
    setsockopt($SOCK, SOL_SOCKET, SO_REUSEADDR, 1);
    bind $SOCK, pack_sockaddr_in(20116, inet_aton('127.0.0.1'));
    listen $SOCK, SOMAXCONN;
    $SOCK;
  }

DESCRIPTION

Parallel::MPM::Prefork is a pre-forking multi-processing module that adjusts the number of child processes dynamically depending on the current work load reported by the child processes. The child processes can send the main process (almost) any kind of data at any time.

FUNCTIONS

By default, all functions described below are exported.

pf_init ( %options )

Initialization. Creates a process group (see NOTES), sets up internal child communications channels, reaps potentially left-over child processes and installs a SIGCHLD handler. Does not fork any child processes.

Returns false on error. Accepts an optional hash of the following options:

max_servers

Maximum total number of child processes. Default: 73.

max_spare_servers

Maximum number of idle child processes. Surplus idle child processes will receive a SIGTERM (and are supposed to obey it). Default: 10.

min_spare_servers

Minimum number of idle child processes. Default: 5.

start_servers

Number of child processes initially created by pf_whip_kids() or pf_kid_new(). Default: 5.

child_sigh

Signal handlers to be installed in the child. A hash reference whose keys are space-separated signal names and whose values are code references or the special strings 'DEFAULT' or 'IGNORE', e.g. { HUP => $code, 'INT TERM' => 'DEFAULT' }. Default: undef.

Any SIGTERM handler should cause the child process to exit sooner or later.

child_data_hook

Code reference to be called when a child calls pf_kid_yell() or pf_kid_exit() with a $data argument. Receives child pid, data and exit code as arguments (in this order). The exit code is undef for pf_kid_yell().

If thaw() was requested (see pf_kid_yell() and pf_kid_exit()) and failed, $data is undef and $Parallel::MPM::Prefork::error contains the original data from Storable::nfreeze().

This hook is executed in a dedicated child process unless data_hook_in_main is set to true.

data_hook_in_main

Boolean value. If false (the default), a separate child process reads child data from pf_kid_yell() and pf_kid_exit() and executes child_data_hook. If true, this is done in the main process.

Note that if you set this to true, a long-running or heavily used child_data_hook will slow down the child process management of the main process. Putting it in a separate process only affects the performance of the child processes.
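The key format of child_sigh described above (several signal names in one space-separated key) can be illustrated with a small sketch. The helper expand_sigh() is hypothetical and not part of Parallel::MPM::Prefork; it only shows one way such a hash could be flattened into one entry per signal.

```perl
use strict;
use warnings;

# Hypothetical helper (not part of Parallel::MPM::Prefork): expands a
# child_sigh-style hash into one entry per signal name.
sub expand_sigh {
    my ($spec) = @_;
    my %handlers;
    for my $names (keys %$spec) {
        # A key may name several signals separated by whitespace.
        $handlers{$_} = $spec->{$names} for split ' ', $names;
    }
    return \%handlers;
}

my $handlers = expand_sigh({
    HUP        => 'IGNORE',
    'INT TERM' => sub { exit 0 },
});

# One entry per signal: HUP, INT and TERM.
print join(' ', sort keys %$handlers), "\n";
```

Both INT and TERM end up with the same code reference, which matches the intent of sharing one handler between several signals.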

pf_whip_kids () and pf_kid_new ()

These two functions manage the child processes. Which one you use is up to your taste and use case. Either one must be called in a loop to keep the show running.

In either case, all signals are blocked during child process creation, the old signal mask is saved, and the signal handlers given by child_sigh are installed. The old signal mask is restored just before pf_kid_new() returns or pf_whip_kids() calls the code reference.

pf_whip_kids ( $code, $args )

Wraps child processing in a single call.

Returns 1 as soon as any child changes status, yells or exits. Immediately returns undef if a fork() failed, or 0 if pf_done() has already been called.

Typical code:

  $SIG{TERM} = $SIG{INT} = sub { pf_done(0) };
  1 while pf_whip_kids(\&echo_server, [$SOCK]);

$code

Code reference to be called in the child processes. Must make sure it calls pf_kid_busy() and pf_kid_idle() as needed. If it returns, the child will exit via exit(0).

$args (optional)

Array reference holding arguments to be passed when $code is called ($code->(@$args)).

pf_kid_new ()

Forks a new child process if too few are idle (< min_spare_servers). Blocks otherwise and kills child processes if too many are idle (> max_spare_servers).

If a new child process was forked, returns the child pid to the parent and 0 to the child. Returns undef if fork() failed.

As a special case it always returns -1 immediately if pf_done() has already been called.

The newly created child is considered idle by the parent. It should call pf_kid_busy() as soon as it starts working and pf_kid_idle() when it is available again so that the parent can arrange for enough available child processes.

Typical code:

  $SIG{TERM} = $SIG{INT} = sub { pf_done(0) };
  while (1) {
    my $pid = pf_kid_new() // die "Could not fork: $!";
    last if $pid < 0;  # pf_done()
    next if $pid;  # parent
    # child:
    pf_kid_busy();
    # do some rocket science
    pf_kid_idle();
    pf_kid_exit();
  }

  END {
    pf_done();
  }
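The fork-or-block rule stated for pf_kid_new() can be written down as a tiny decision function. This is only an illustration of the documented thresholds, not the module's actual code. The name spare_action() and its return values are made up, and the check against max_servers before forking is an assumption derived from the option's description.

```perl
use strict;
use warnings;

# Hypothetical decision function illustrating the documented rule.
# It is not taken from the module's source.
sub spare_action {
    my (%s) = @_;    # keys: idle, total, min_spare, max_spare, max
    # Too few idle children and room left under max_servers: fork one.
    return 'fork' if $s{idle} < $s{min_spare} && $s{total} < $s{max};
    # Too many idle children: the surplus would be sent SIGTERM.
    return 'kill' if $s{idle} > $s{max_spare};
    # Otherwise there is nothing to do until a child changes status.
    return 'wait';
}

# Limits taken from the SYNOPSIS.
my %limits = (min_spare => 2, max_spare => 4, max => 20);

print spare_action(%limits, idle => 1, total => 3), "\n";   # fork
print spare_action(%limits, idle => 5, total => 8), "\n";   # kill
print spare_action(%limits, idle => 3, total => 6), "\n";   # wait
```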

pf_kid_busy ()

To be called by a child process to tell the main process it is busy.

pf_kid_idle ()

To be called by a child process to tell the main process it is idle.

pf_kid_exit ( $exitcode, $data, $thaw )

Calls pf_kid_yell($data, $thaw) and then exits from the child via exit($exitcode). $exitcode will be truncated to an 8-bit unsigned integer and defaults to 0 if omitted. $data and $thaw are optional (see pf_kid_yell()).
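To see what truncation to an 8-bit unsigned integer means for the value a parent observes, here is a minimal sketch. It assumes that truncation keeps the low eight bits, which is how process exit statuses behave on POSIX systems. The helper truncate_exit_code() is hypothetical.

```perl
use strict;
use warnings;

# Hypothetical helper: keep only the low 8 bits of an exit code and
# fall back to 0 when no code is given.
sub truncate_exit_code {
    my ($code) = @_;
    return ($code // 0) & 0xFF;
}

printf "%d -> %d\n", $_, truncate_exit_code($_) for 0, 1, 255, 256, 257;
```

Under this assumption 256 maps to 0 and 257 maps to 1, so exit codes above 255 are best avoided.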

pf_kid_yell ( $data, $thaw )

Sends data from a child to the main process which then calls child_data_hook($pid, $data, undef) with either the serialized or (if $thaw is true) deserialized data.

Returns true on success, undef otherwise. If $data could not be serialized, $Parallel::MPM::Prefork::error contains the error message from Storable::nfreeze().

This function is a no-op if it is called in the main process, if child_data_hook is not set, or if $data is not a reference.

As all child processes share the same upstream socket to the parent you should probably not send more than POSIX::PIPE_BUF bytes in one go if your children are nerve-racking blare machines. Otherwise the data might be split into smaller chunks and get intermixed with data from other child processes sending at the same time. While this could be avoided with some extra effort I prefer to keep it simple.
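A child can estimate up front whether a message stays within that limit. The sketch below compares the length of the Storable::nfreeze() string with POSIX::PIPE_BUF. The helper fits_in_pipe_buf() is hypothetical, and the check ignores any framing bytes the module may add on the wire, so treat it as a rough guard rather than an exact bound.

```perl
use strict;
use warnings;
use POSIX qw(PIPE_BUF);
use Storable qw(nfreeze);

# Hypothetical helper: true if the frozen form of $data is no longer than
# PIPE_BUF bytes. Any transport overhead is not accounted for.
sub fits_in_pipe_buf {
    my ($data) = @_;
    return length(nfreeze($data)) <= PIPE_BUF();
}

my $small = { foo => 'bar' };
my $large = [ 'x' x 100_000 ];

print "small fits\n"        if fits_in_pipe_buf($small);
print "large is too big\n"  unless fits_in_pipe_buf($large);
```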

$data (required)

A reference to any Perl data type that Storable can serialize. Will be passed to child_data_hook in the main process as a Storable::nfreeze() string.

$thaw (optional)

A boolean value. If true, $data will be deserialized with Storable::thaw() before passing it to child_data_hook.
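The difference between the two forms of $data can be reproduced with Storable alone. In this sketch the helper hook_view() is hypothetical; it mimics what a child_data_hook would receive for a given $thaw flag, without involving any child processes.

```perl
use strict;
use warnings;
use Storable qw(nfreeze thaw);

# Hypothetical helper: returns what a child_data_hook would be handed as
# $data, depending on the $thaw flag passed by the child.
sub hook_view {
    my ($data, $thaw) = @_;
    my $frozen = nfreeze($data);          # serialized form sent to the parent
    return $thaw ? thaw($frozen) : $frozen;
}

my $raw    = hook_view({ foo => 'bar' }, 0);   # plain string, not a reference
my $cooked = hook_view({ foo => 'bar' }, 1);   # hash reference again

print ref($raw) ? "reference\n" : "frozen string\n";
print "foo = $cooked->{foo}\n";
```

Leaving $thaw false lets the hook defer or skip deserialization, for example when it only forwards the frozen string elsewhere.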

pf_done ( $exitcode )

To be called by the main process when you are done.

Sends all child processes a SIGTERM, waits for all child processes to terminate, reads remaining child data and executes child_data_hook if necessary.

Exits with $exitcode if given, returns otherwise.

VARIABLES

$Parallel::MPM::Prefork::error

Holds an error message if pf_init() failed or the data provided by pf_kid_yell() or pf_kid_exit() could not be serialized or deserialized.

$Parallel::MPM::Prefork::VERSION

The module's version.

NOTES

You Don't Mess with the ZIGCHLD

Parallel::MPM::Prefork relies on SIGCHLD being delivered to its own handler in the main process (installed by pf_init()) and select() being interrupted by at least SIGCHLD.

Forking your own processes

Parallel::MPM::Prefork creates a process group with setpgrp() which allows it to wait only for its own child processes. That is, if you want to fork and wait for an independent child process you just call setpgrp() in the child and waitpid($pid, ...) in the main process.

system(LIST) can be replaced by system('setsid', LIST).

However, Parallel::MPM::Prefork will still catch SIGCHLD (see previous note).
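The setpgrp() and waitpid() recipe from this note might look as follows. This is a minimal sketch using core Perl only. The helper run_independent() is hypothetical, and the use of POSIX::_exit() (to skip END blocks such as the pf_done() one in the child) is a design choice of this example, not something the module prescribes.

```perl
use strict;
use warnings;
use POSIX ();

# Hypothetical helper: runs $code in a child that leaves the parent's
# process group, then waits for exactly that child and returns its exit code.
sub run_independent {
    my ($code) = @_;
    my $pid = fork() // die "Could not fork: $!";
    if ($pid == 0) {
        setpgrp(0, 0);               # become leader of a new process group
        POSIX::_exit($code->());     # exit without running END blocks
    }
    waitpid($pid, 0);                # wait for this pid only
    return $? >> 8;                  # exit code of the child
}

my $status = run_independent(sub { 3 });
print "independent child exited with $status\n";
```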

Difference from Parallel::ForkManager

With Parallel::ForkManager, the main process decides in advance how much work there is to do, how to split it up and how many child processes will work in parallel. A child is always considered busy.

With Parallel::MPM::Prefork, the child processes take on work automatically as it arrives. A child may be busy or idle. The main process only makes sure there are always enough child processes available without too many idling around.

Keep in mind that these are completely different use cases.

SEE ALSO

Net::Server::Prefork

Similar to Parallel::MPM::Prefork but limited to serving network connections. Heavyweight hook-laden OO style. A pain in the ass when it comes to customizing signal handling but its pipe concept for managing child processes rocks. Inspired the creation of this module.

Parallel::ForkManager

Different use case (see NOTES). Waits for all child processes, not only its own offspring; we don't (see NOTES). Features the awesome inline child code paradigm.

Parallel::Prefork::SpareWorkers

Kind of a hybrid between Net::Server::Prefork and Parallel::ForkManager. Fails to manage the workers if you want to keep them alive.

ACKNOWLEDGEMENTS

Thanks to the UN for not condemning child labor on the operating system level.

COPYRIGHT

Copyright © 2013 Carsten Gaebler (cgpan ʇɐ gmx ʇop de). All rights reserved.

I only accept encrypted e-mails, either via SMIME or GPG.

LICENSE

This program is free software. You can redistribute and/or modify it under the same terms as Perl itself.