NAME

forks - drop-in replacement for Perl threads using fork()

VERSION

This documentation describes version 0.23.

SYNOPSIS

  use forks;
  use warnings;

  my $thread = threads->new( sub {       # or ->create or async()
    print "Hello world from a thread\n";
  } );

  $thread->join;
  
  $thread = threads->new( { 'context' => 'list' }, sub {
    print "Thread is expected to return a list\n";
    return (1, 'abc', 5);
  } );
  my @result = $thread->join();

  threads->detach;
  $thread->detach;

  my $tid    = $thread->tid;
  my $owntid = threads->tid;

  my $self    = threads->self;
  my $threadx = threads->object( $tidx );

  my @running = threads->list(threads::running);
  $_->join() foreach (threads->list(threads::joinable));
  $_->join foreach threads->list; #block until all threads done

  unless (fork) {
    threads->isthread; # could be used in a child-init Apache handler
  }

  # Enable debugging
  use forks qw(debug);
  threads->debug( 1 );
  
  # Stringify thread objects
  use forks qw(stringify);
  
  # Check state of a thread
  my $thr = threads->new( ... );
  if ($thr->is_running()) {
    print "Thread $thr running\n"; #prints "Thread 1 running"
  }
  
  # Send a signal to a thread
  $thr->kill('SIGUSR1');

  # Manual deadlock detection
  if ($thr->is_deadlocked()) {
    print "Thread $thr is currently deadlocked!\n";
  }
  
  # Use forks as a drop-in replacement for an ithreads application
  perl -Mforks -Mforks::shared threadapplication
  

See "SYNOPSIS" in threads for more examples.

DESCRIPTION

The "forks" pragma allows a developer to use threads without having a threaded perl, or even having to run Perl 5.8.0 or higher.

Refer to the threads module for ithreads API documentation. Also, use

    perl -Mforks -e 'print $threads::VERSION'
    

to see what version of threads you should refer to regarding supported API features.

There are a number of goals that I am trying to reach with this implementation.

    Using this module only makes sense if you run on a system whose Operating System provides an implementation of the fork function. Windows is currently the only known system on which Perl runs that does not have an implementation of fork. Therefore, it makes no sense to use this module on a Windows system, and a check made during installation bars you from installing it there.

memory usage

The standard Perl 5.8.0 threads implementation is very memory consuming, which makes it basically impossible to use in a production environment, particularly with mod_perl and Apache. Because of the use of the standard Unix fork() capabilities, most operating systems will be able to use the Copy-On-Write (COW) memory sharing capabilities (whereas with the standard Perl 5.8.0 threads implementation, this is thwarted by the Perl interpreter cloning process that is used to create threads). The memory savings have been confirmed.

mod_perl / Apache

This threads implementation allows you to use a standard, pre-forking Apache server and have the children act as threads (with the class method "isthread").

same API as threads

You should be able to run threaded applications unchanged by simply making sure that the "forks" and "forks::shared" modules are loaded, e.g. by specifying them on the command line. Forks is currently API compatible with CPAN threads version 1.53.

Additionally, you do not need to worry about upgrading to the latest Perl maintenance release to ensure that the (CPAN) release of threads you wish to use is fully compatible and stable. Forks code is completely independent of the perl core, and thus will guarantee reliable behavior on any release of Perl 5.8 or later. (Note that there may be behavior variances if running under Perl 5.6.x, as that version does not support safe signals and requires a source filter to load forks.)

using as a development tool

Because you do not need a threaded Perl to use forks.pm, you can start prototyping threaded applications with the Perl executable that you are used to. Just download and install the "forks" package from CPAN. So the threshold for trying out threads in Perl has become much lower. Even Perl 5.005 should, in principle, be able to support the forks.pm module; however, because of some issues with regard to the availability of XS features between different versions of Perl, it seems that 5.6.0 (unthreaded) is the minimum you need.

Additionally, forks offers a full thread deadlock detection engine, to help discover and optionally resolve locking issues in threaded applications. See "Deadlock detection and resolution" in forks::shared for more information.

using in production environments

This package has successfully been proven as stable and reliable in production environments. I have personally used it in high-availability, database-driven, message processing server applications since 2004 with great success.

Also, unlike pure ithreads, forks.pm is fully compatible with all perl modules, whether or not they have been updated to be ithread safe. This means that you do not need to feel limited in what you can develop as a threaded perl application, a problem that continues to plague the acceptance of ithreads in production environments today. Just handle these modules as you would when using a standard fork: be sure to create new instances of, or connections to, resources where a single instance cannot be shared between multiple processes.

The only major concern is the potentially slow (relative to pure ithreads) performance of shared data and locks. If your application doesn't depend on extensive semaphore use, and reads/writes from shared variables moderately (such as using them primarily to deliver data to a child thread to process and the child thread uses a shared structure to return the result), then this will likely not be an issue for your application. See the TODO section regarding plans to tackle this issue.
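
The pattern described above might look like the following minimal sketch. It assumes forks and forks::shared are installed and uses only the standard threads API (new, join, share, lock). The variable names and the squaring workload are made up for illustration.

```perl
use forks;
use forks::shared;
use strict;
use warnings;

# Shared structure the worker uses to hand back its results.
my %result;
share(%result);

# Input is passed as plain arguments, so it is copied to the worker once.
my @input = (1 .. 5);

my $worker = threads->new(sub {
    my @numbers = @_;

    # Do the real work on private data first...
    my %squares = map { $_ => $_ * $_ } @numbers;

    # ...then touch the shared hash only briefly, under a lock.
    lock(%result);
    $result{$_} = $squares{$_} for keys %squares;
    return;
}, @input);

$worker->join;

print "$_ squared is $result{$_}\n" for sort { $a <=> $b } keys %result;
```

Keeping the computation on private data and writing to the shared hash only at the end limits the number of shared variable accesses, which is where the extra cost lies.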

Also, you may wish to try forks::BerkeleyDB, which has shown significant performance gains and consistent throughput in high-concurrency shared variable applications.

Perl built without native ithreads

If your Perl release was not built with ithreads or does not support ithreads, you will have a compile-time option of installing forks into the threads and threads::shared namespaces. This is done as a convenience to give users a reasonably seamless ithreads API experience without having to rebuild their distribution with native threading (and its slight performance overhead on all perl runtime, even if not using threads).

Note: When using forks in this manner (e.g. "use threads;") for the first time in your code, forks will attempt to behave identically to threads relative to the current version of threads it supports (refer to $threads::VERSION), even if the behavior is (or was) considered a bug. At this time, this means that shared variables will lose their pre-existing value at the time they are shared and that splice will die if attempted on a shared scalar.

If you use forks for the first time as "use forks" and other loaded code uses "use threads", then this threads behavior emulation does not apply.

REQUIRED MODULES

 Devel::Required (0.07)
 File::Spec (any)
 IO::Socket (1.18)
 List::MoreUtils (0.15)
 reaper (0.03)
 Scalar::Util (1.01)
 Storable (any)
 Time::HiRes (any)

IMPLEMENTATION

This version is mostly written in Perl. Inter-process communication is done by using sockets, with the process that stores the shared variables as the server and all the processes that function as threads, as clients.

why sockets?

The reason I chose sockets for inter-thread communication over a shared memory library is that a blocking socket allows you to elegantly solve the problem of a thread that is blocking for a certain event. Any polling that might occur does not happen at the Perl level but at the level of the socket, which should perform much better and is probably well optimized already.
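
To illustrate the idea, here is a minimal sketch using only core Perl (socketpair and fork), not the actual forks internals. One process waits for an event by doing a plain blocking read on a socket, so there is no polling loop in Perl code. The handle names and the message text are made up for the example.

```perl
use strict;
use warnings;
use Socket;
use IO::Handle;

# Create a connected pair of UNIX stream sockets.
socketpair(my $waiter, my $notifier, AF_UNIX, SOCK_STREAM, PF_UNSPEC)
    or die "socketpair failed: $!";
$notifier->autoflush(1);

my $pid = fork();
die "fork failed: $!" unless defined $pid;

if ($pid == 0) {
    # Child: plays the role of the process that announces the event.
    close $waiter;
    sleep 1;                            # simulate work before the event occurs
    print {$notifier} "event ready\n";
    close $notifier;
    exit 0;
}

# Parent: plays the role of a thread waiting for the event.
close $notifier;
my $message = <$waiter>;                # blocks inside the read, no Perl-level loop
chomp $message;
waitpid($pid, 0);

print "received: $message\n";
```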

EXTRA CLASS METHODS

Apart from the standard class methods, the following class methods are supplied by the "forks" threads implementation.

isthread

 unless (fork) {
   threads->isthread; # this process is a detached thread now
   exit;              # can not return values, as thread is detached
 }

The isthread class method attempts to make a connection with the shared variables process. If it succeeds, then the process will function as a detached thread and will allow all the threads methods to operate.

This method is mainly intended to be used from within a child-init handler in a pre-forking Apache server. All the children that handle requests become threads as far as Perl is concerned, allowing you to use shared variables between all of the Apache processes.

debug

 threads->debug( 1 );
 $debug = threads->debug;

The "debug" class method allows you to (re)set a flag which causes extensive debugging output of the communication between threads to be output to STDERR. The format is still subject to change and therefore still undocumented.

Debugging can only be switched on by defining the environment variable THREADS_DEBUG. If the environment variable does not exist when the forks.pm module is compiled, then all debugging code will be optimised away for better performance. If the environment variable has a true value, then debugging will also be enabled from the start.

EXTRA FEATURES

Deadlock detection

Forks also offers a full thread deadlock detection engine, to help discover and optionally resolve locking issues in threaded applications. See "Deadlock detection and resolution" in forks::shared for more information.

INET socket IP mask

For security, inter-thread communication INET sockets will only allow connections from the default local machine IPv4 loopback address (i.e. 127.0.0.1). However, this filter may be modified by defining the environment variable THREADS_IP_MASK with a standard perl regular expression (or with no value, which would disable the filter).
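
As a rough illustration of what such a mask can express, the sketch below matches a few peer addresses against regular expressions. The peer_allowed helper and both masks are hypothetical. How forks.pm applies the mask internally is not shown here, so treat this only as a way to try out a pattern before putting it in THREADS_IP_MASK.

```perl
use strict;
use warnings;

# Hypothetical helper: true if $peer_ip passes $mask.
# An undefined or empty mask is treated as "no filter".
sub peer_allowed {
    my ($peer_ip, $mask) = @_;
    return 1 unless defined $mask && length $mask;
    return $peer_ip =~ /$mask/ ? 1 : 0;
}

# Example masks (assumptions for illustration only).
my $loopback_only = '^127\.0\.0\.1$';
my $lan_mask      = '^(?:127\.0\.0\.1|192\.168\.1\.\d{1,3})$';

for my $check (
    [ '127.0.0.1',    $loopback_only ],
    [ '10.0.0.5',     $loopback_only ],
    [ '192.168.1.42', $lan_mask      ],
    [ '10.0.0.5',     ''             ],
) {
    my ($ip, $mask) = @$check;
    printf "%-13s mask=%-45s %s\n", $ip, "'$mask'",
        peer_allowed($ip, $mask) ? 'allowed' : 'rejected';
}
```

Anchoring the pattern with ^ and $ and escaping the dots avoids accidentally matching longer or unrelated addresses.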

UNIX socket support

For users who do not wish to (or cannot) use TCP sockets, UNIX socket support is available. This can only be switched on by defining the environment variable THREADS_SOCKET_UNIX. If the environment variable has a true value, then UNIX sockets will be used instead of the default TCP sockets. Socket descriptors are currently written to /var/tmp and given a+rw access by default (for cleanest functional support on multi-user systems).

This feature is excellent for applications that require extra security, as it does not expose forks.pm to any INET vulnerabilities your system may be subject to (e.g. on systems not protected by a firewall). It may also provide an additional performance boost, as there is less system overhead necessary to handle UNIX vs INET socket communication.

NOTES

Some important items you should be aware of.

Signal behavior

Unlike ithreads, signals being sent are standard OS signals, so you should program defensively if you plan to use inter-thread signals.

Also, be aware that certain signals may be untrappable depending on the target platform, such as SIGKILL and SIGSTOP. Thus, it is recommended you only use normal signals (such as TERM, INT, HUP, USR1, USR2) for inter-thread signal handling.

Modifying signals

Since the threads API provides a method to send signals between threads (processes), untrapped normal and error signals are defined by forks with a basic CORE::exit() shutdown function to provide safe termination.

Thus, if you (or any modules you use) modify signal handlers, it is important that the signal handlers at least remain defined and are not undefined (for whatever reason). The system signal handler default, usually abnormal process termination which skips END blocks, may cause undesired behavior if a thread exits due to an unhandled signal.
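
The difference can be seen with plain fork, without forks itself. In this minimal sketch the child replaces its TERM handler with one that still calls exit(), so its END block runs when the parent signals it. The pipe, the messages, and the variable names are made up for the example.

```perl
use strict;
use warnings;
use IO::Handle;

# Pipe used by the child to report back to the parent.
pipe(my $reader, my $writer) or die "pipe failed: $!";
$writer->autoflush(1);

my $pid = fork();
die "fork failed: $!" unless defined $pid;

if ($pid == 0) {
    close $reader;
    # Replace the TERM handler but keep it defined: exit() ends the
    # process normally, so END blocks still run.
    $SIG{TERM} = sub { exit 0 };
    print {$writer} "ready\n";
    sleep 1 while 1;                    # idle until the signal arrives
}

close $writer;
my $ready = <$reader>;                  # wait until the handler is installed
kill 'TERM', $pid;
my $report = <$reader>;                 # written by the child's END block
waitpid($pid, 0);
chomp $report;

print "child reported: $report\n";

# Runs at exit in both processes; only the child has something to report.
END { print {$writer} "END block ran\n" if defined $pid && $pid == 0 }
```

Had the child set the handler to 'DEFAULT' or deleted it, the signal would normally terminate the process without running the END block, and the parent would read nothing.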

Modules that modify $SIG{CHLD}

In order to be compatible with perl's core system() function on all platforms, extra care has gone into implementing a smarter $SIG{CHLD} in forks.pm. If any modules you use modify $SIG{CHLD} (or if you attempt to modify it yourself), you may end up with undesired issues such as unreaped processes or a system() function that returns -1 instead of the correct exit value. See perlipc for more information regarding common issues with modifying $SIG{CHLD}.

If $SIG{CHLD} has to be modified in any way by your software, please take extra care to implement a handler that follows the requirements of chained signal handlers. See reaper for more information.
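
The basic idea of chaining is to remember whatever handler is already installed and call it from your own. The following is a minimal core-Perl sketch of that idea only. It does not use the reaper module's interface, and the "earlier handler" is a stand-in for one installed by some other module.

```perl
use strict;
use warnings;

my @calls;    # records which handlers ran, in order

# Stand-in for a CHLD handler installed earlier by some other module.
$SIG{CHLD} = sub { push @calls, 'earlier handler' };

# Chain instead of overwrite: remember the existing handler and call it
# after our own bookkeeping.
my $previous = $SIG{CHLD};
$SIG{CHLD} = sub {
    my ($signame) = @_;
    push @calls, 'our handler';
    $previous->($signame) if ref $previous eq 'CODE';
};

my $pid = fork();
die "fork failed: $!" unless defined $pid;
exit 0 if $pid == 0;                    # child exits at once, raising SIGCHLD

waitpid($pid, 0);                       # reap the child in the main flow

print "handlers called: @calls\n";
```

Note that the new handler does not call waitpid itself. Reaping inside a handler can collect children that other code is waiting for, which is one way to end up with a system() call that returns -1.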

You may define the environment variable THREADS_SIGCHLD_IGNORE to force forks to use 'IGNORE' on systems where a custom CHLD signal handler has been automatically installed to support the correct exit code of the perl core system() function. There should be no need to use this unless you encounter specific issues with reaper signal chaining.

CAVEATS

Some caveats that you need to be aware of.

Greater latency

Because sockets are used for inter-thread communication, there is inherently greater latency in the interaction between threads. However, the fact that TCP sockets are used may open up the possibility of sharing threads over more than one physical machine.

You may decrease some latency by using UNIX sockets (see "UNIX socket support").

Also, you may wish to try forks::BerkeleyDB, which has shown significant performance gains and consistent throughput in applications requiring high-concurrency shared variable access.

Module CLONE functions and threads

In rare cases, module CLONE functions may have issues when being auto-executed by a new thread (forked process). This only affects modules that use XS data (objects or structs) created by external C libraries. If a module attempts to CLONE non-fork safe XS data, at worst it may core dump only the newly created thread (process).

If you treat such sensitive resources (such as DBI driver instances) as non-thread-safe by default and close these resources prior to creating a new thread, you should never encounter any issues.

Signals and safe-signal enabled Perl

In order to use signals, you must be using perl 5.8 compiled with safe signal support. Otherwise, you'll get a fatal error like "Cannot signal threads without safe signals" if you try to use signal functions.

Source filter

To get forks.pm working on Perl 5.6.x, it was necessary to use a source filter to ensure a smooth upgrade path from using forks under Perl 5.6.x to Perl 5.8.x and higher. The source filter used is pretty simple and may prove to be too simple. Please report any problems that you may find when running under 5.6.x.

TODO

See the TODO file in the distribution.

KNOWN PROBLEMS

These problems are known and will hopefully be fixed in the future:

inter-thread signaling is experimental and potentially unstable

This feature is considered experimental and has rare synchronization issues when sending a signal to a process in the middle of sending or receiving socket data pertaining to a threads operation. This will be addressed in a future release.

test-suite exits in a weird way

Although there are no errors in the test-suite, the test harness sometimes thinks there is something wrong because of an unexpected exit() value. This is an issue with Test::More's END block, which wasn't designed to co-exist with a threads environment and forked processes. Hopefully, that module will be patched in the future, but for now, the warnings are harmless and may be safely ignored.

And of course, there might be other, undiscovered issues. Patches are welcome!

CREDITS

Refer to the CREDITS file included in the distribution.

CURRENT MAINTAINER

Eric Rybski <rybskej@yahoo.com>.

ORIGINAL AUTHOR

Elizabeth Mattijsen, <liz@dijkmat.nl>.

COPYRIGHT

Copyright (c) 2005-2006 Eric Rybski <rybskej@yahoo.com>, 2002-2004 Elizabeth Mattijsen <liz@dijkmat.nl>. All rights reserved. This program is free software; you can redistribute it and/or modify it under the same terms as Perl itself.

SEE ALSO

threads, forks::BerkeleyDB.