
NAME

forks - drop-in replacement for Perl threads using fork()

SYNOPSIS

  use forks;

  my $thread = threads->new( sub {       # or ->create or async()
    print "Hello world from a thread\n";
  } );

  $thread->join;

  threads->detach;
  $thread->detach;

  my $tid    = $thread->tid;
  my $owntid = threads->tid;

  my $self    = threads->self;
  my $threadx = threads->object( $tidx );

  threads->yield();

  $_->join foreach threads->list;

  unless (fork) {
    threads->isthread; # intended to be used in a child-init Apache handler
  }

  use forks qw(debug);
  threads->debug( 1 );

  perl -Mforks -Mforks::shared threadapplication

DESCRIPTION

The "forks" pragma allows a developer to use threads without needing a threaded perl, or even perl 5.8.0 or higher. There are a number of goals that I am trying to reach with this implementation.

    Using this module only makes sense on a system whose operating system implements the fork function. Windows is currently the only known system on which Perl runs that does not implement fork, so it makes no sense to use this module there. A check during installation therefore bars you from installing on a Windows system.

memory usage

The standard Perl 5.8.0 threads implementation is very memory consuming, which makes it basically impossible to use in a production environment, particularly with mod_perl and Apache. Because of the use of the standard Unix fork() capabilities, most operating systems will be able to use the Copy-On-Write (COW) memory sharing capabilities (whereas with the standard Perl 5.8.0 threads implementation, this is thwarted by the Perl interpreter cloning process that is used to create threads). The memory savings have been confirmed.

mod_perl / Apache

This threads implementation allows you to use a standard, pre-forking Apache server and have the children act as threads (with the class method isthread). This is as yet untested within Apache, but should work.

same API as threads

You should be able to run threaded applications unchanged by simply making sure that the "forks.pm" and "forks::shared.pm" modules are loaded, e.g. by specifying them on the command line.

using as a development / demonstration tool

Because you do not need a threaded Perl to use forks.pm, you can start prototyping threaded applications with the Perl executable that you are used to. Just download the "forks.pm" package from CPAN and install it. The threshold for trying out threads in Perl has therefore become much lower. Even Perl 5.005 should in principle be able to support the forks.pm module. However, because of differences in the XS features available in different versions of Perl, it seems that you need at least 5.6.0 (unthreaded).

using in production environments

This package has proven stable and reliable in production environments. I have personally used it in high-availability, database-driven, financial message processing server applications for more than a year now with great success. Also, unlike pure ithreads, forks.pm is fully compatible with all perl modules, whether or not they have been updated to be ithread safe. This means that you do not need to feel limited in what you can develop as a threaded perl application, a problem that continues to plague the acceptance of ithreads in production environments today. Just handle these modules as you would when using a standard fork: be sure to create new instances of, or connections to, resources where a single instance cannot be shared between multiple processes.

The only major issue yet to tackle is the potentially slow (relative to pure ithreads) performance of shared data and lock use. If your application doesn't depend on extensive semaphore use and only makes limited reads from and writes to shared variables (for example, using them primarily to deliver data to a child thread, which then returns its result through a shared structure), then this will likely not be an issue for your application. See the TODO section regarding my plans to tackle this issue.
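The advice about per-process resources can be sketched with plain fork() and core modules only. This is an illustration under assumptions, not code from forks.pm: open_resource(), resource() and owner_in_child() are hypothetical names, and the hash stands in for a real database handle or socket connection. The idea is to remember which process created an instance and to create a fresh one when the current process is not the owner.

```perl
use strict;
use warnings;
use POSIX ();

# Hypothetical stand-in for a database handle or socket connection.
sub open_resource { return { owner_pid => $$ } }

my $resource;

# Return a resource owned by the current process, re-creating it when the
# cached instance was inherited from another process across a fork.
sub resource {
    $resource = open_resource()
        unless $resource && $resource->{owner_pid} == $$;
    return $resource;
}

# Fork one child and return [child pid, pid owning the child's resource].
sub owner_in_child {
    resource();                        # parent opens its own instance first
    pipe(my $reader, my $writer) or die "pipe: $!";
    my $pid = fork();
    die "fork: $!" unless defined $pid;
    if ($pid == 0) {
        close $reader;
        print {$writer} resource()->{owner_pid}, "\n";
        close $writer;                 # flushes the line to the parent
        POSIX::_exit(0);               # leave without running the parent's END blocks
    }
    close $writer;
    my $owner = <$reader>;
    close $reader;
    waitpid($pid, 0);
    chomp $owner;
    return [$pid, $owner];
}
```

Under these assumptions the child reports its own process id as owner, which shows that it did not reuse the instance the parent had already opened.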

REQUIRED MODULES

 Devel::Required (any)
 IO::Socket (1.18)
 Scalar::Util (1.01)
 Storable (any)

IMPLEMENTATION

This version is mostly written in Perl. Inter-process communication is done by using sockets, with the process that stores the shared variables as the server and all the processes that function as threads, as clients.

why sockets?

The reason I chose sockets for inter-thread communication over a shared memory library is that a blocking socket allows you to elegantly solve the problem of a thread that is blocking for a certain event. Any polling that might occur does not happen at the Perl level but at the level of the socket, which should perform much better and is probably highly optimized already.
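The blocking behaviour described above can be illustrated with a minimal sketch that uses only core modules. It is not the forks.pm protocol: the "GET key" line format, the fetch_shared() name and the contents of %shared are assumptions made for the example, and socketpair() replaces the real server socket to keep it short. One process owns the data and answers requests; the other simply blocks in a read until the answer arrives, without any polling loop in Perl.

```perl
use strict;
use warnings;
use Socket;
use IO::Handle;
use POSIX ();

# Ask a forked "shared variables" process for one value and return it.
sub fetch_shared {
    my ($key) = @_;
    socketpair(my $client, my $server, AF_UNIX, SOCK_STREAM, PF_UNSPEC)
        or die "socketpair: $!";
    $_->autoflush(1) for $client, $server;

    my $pid = fork();
    die "fork: $!" unless defined $pid;
    if ($pid == 0) {                       # server: owns the shared data
        close $client;
        my %shared = (answer => 42, name => 'forks');
        while (my $request = <$server>) {  # blocks until a request arrives
            chomp $request;
            my ($verb, $wanted) = split ' ', $request, 2;
            my $value = defined $wanted ? $shared{$wanted} : undef;
            print {$server} defined $value ? "$value\n" : "\n";
        }
        POSIX::_exit(0);                   # client closed its end
    }

    close $server;                         # client: acts as the "thread"
    print {$client} "GET $key\n";
    my $reply = <$client>;                 # blocks in the kernel, no Perl-level polling
    close $client;                         # server sees end-of-file and exits
    waitpid($pid, 0);
    chomp $reply;
    return $reply;
}

print fetch_shared('answer'), "\n";        # prints 42
```

An unknown key yields an empty string in this sketch; a real protocol would need to distinguish a missing value from an empty one.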

EXTRA CLASS METHODS

Apart from the standard class methods, the following class methods are supplied by the "forks" threads implementation.

isthread

 unless (fork) {
   threads->isthread; # this process is a detached thread now
   exit;              # can not return values, as thread is detached
 }

The "isthread" class method attempts to make a connection with the shared variables process. If it succeeds, then the process will function as a detached thread and will allow all the threads methods to operate.

This method is mainly intended to be used from within a child-init handler in a pre-forking Apache server. All the children that handle requests become threads as far as Perl is concerned, allowing you to use shared variables between all of the Apache processes.

debug

 threads->debug( 1 );
 $debug = threads->debug;

The "debug" class method allows you to (re)set a flag which causes extensive debugging output of the communication between threads to be output to STDERR. The format is still subject to change and therefore still undocumented.

Debugging can only be switched on by defining the environment variable THREADS_DEBUG. If the environment variable does not exist when the forks.pm module is compiled, then all debugging code will be optimised away for better performance. If the environment variable has a true value, then debugging will also be enabled from the start.
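The compile-time behaviour described above matches a common Perl idiom, sketched here as an assumption about the general technique rather than as the actual forks.pm source. The names DEBUG, $debug and debug_message() are hypothetical. A constant is derived from the environment when the code is compiled, and statements guarded by a false constant are folded away by perl.

```perl
use strict;
use warnings;

# True only if THREADS_DEBUG exists in the environment at compile time.
use constant DEBUG => exists $ENV{THREADS_DEBUG};

# Run-time flag: starts enabled only when the variable also has a true value.
my $debug = DEBUG && $ENV{THREADS_DEBUG} ? 1 : 0;

# Return the text that would be logged, or '' when debugging is off.
sub debug_message {
    my ($message) = @_;
    # With DEBUG false, perl removes this statement during compilation.
    return "debug: $message" if DEBUG && $debug;
    return '';
}
```

Because the constant is fixed at compile time, the variable has to be set before the module is loaded, for example in the shell or in a BEGIN block.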

EXTRA FEATURES

UNIX socket support

For users who do not wish to (or cannot) use TCP sockets, UNIX socket support is available. This can only be switched on by defining the environment variable THREADS_SOCKET_UNIX. If the environment variable has a true value, then UNIX sockets will be used instead of the default TCP sockets. Socket descriptors are currently written to /var/tmp and given a+rw access by default (for cleanest functional support on multi-user systems).

This feature is excellent for applications that require extra security, as it does not expose forks.pm to any INET vulnerabilities your system may be subject to (such as on systems that are not run behind a firewall). It may also provide an additional, minor performance boost, as less system overhead is needed to handle UNIX socket communication than INET socket communication.
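To show what a UNIX domain socket looks like compared with a TCP socket, here is a small sketch using the core IO::Socket::UNIX module. It does not reproduce how forks.pm sets up its sockets: the function name unix_socket_roundtrip(), the file name shared.sock and the use of a temporary directory instead of /var/tmp are all assumptions for the example. The socket is a file system entry, so access is governed by file permissions and nothing listens on a network port.

```perl
use strict;
use warnings;
use Socket qw(SOCK_STREAM);
use IO::Socket::UNIX;
use File::Temp qw(tempdir);
use POSIX ();

# Send one line from a forked child to the parent over a UNIX domain socket.
sub unix_socket_roundtrip {
    my ($message) = @_;
    my $dir  = tempdir(CLEANUP => 1);
    my $path = "$dir/shared.sock";          # hypothetical socket file name

    my $listener = IO::Socket::UNIX->new(
        Type   => SOCK_STREAM,
        Local  => $path,
        Listen => 1,
    ) or die "listen on $path: $!";
    chmod 0666, $path or die "chmod $path: $!";   # a+rw, as the text describes

    my $pid = fork();
    die "fork: $!" unless defined $pid;
    if ($pid == 0) {                        # child: connects as a client
        my $client = IO::Socket::UNIX->new(
            Type => SOCK_STREAM,
            Peer => $path,
        ) or POSIX::_exit(1);
        print {$client} "$message\n";
        close $client;                      # flushes the line to the parent
        POSIX::_exit(0);
    }

    my $connection = $listener->accept or die "accept: $!";
    my $line = <$connection>;               # blocks until the child has written
    close $connection;
    close $listener;
    waitpid($pid, 0);
    chomp $line;
    return $line;
}
```

The chmod call mirrors the a+rw default mentioned above; an application with stricter requirements would choose a narrower mode.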

CAVEATS

Some caveats that you need to be aware of.

Greater latency

Because of the use of sockets for inter-thread communication, there is inherently greater latency in the interaction between threads. However, the fact that TCP sockets are used may open up the possibility of sharing threads over more than one physical machine.

Source filter

To get forks.pm working on Perl 5.6.x, it was necessary to use a source filter to ensure a smooth upgrade path from using forks under Perl 5.6.x to Perl 5.8.x and higher. The source filter used is pretty simple and may prove to be too simple. Please report any problems that you may find when running under 5.6.x.

TODO

It would be an idea to add the feature to transparently bless across threads (which is promised in the documentation of threads.pm, but which I personally don't see happening before Ponie and/or Perl 6 comes around).

I'm looking for suggestions on ways to decrease latency of inter-thread communication. I have been mulling over the idea of implementing parts of the forks::shared interface with different transport interfaces (such as SysV shared memory for variable data and locks), and providing these as drop-in replacements for forks::shared. My goal is to try to create a set of flexible alternatives that best fit the needs of end application developers.

Add a method to give the user run-time privilege control over UNIX socket file descriptors (for additional security control).

And of course, there might still be bugs in there. Patches are welcome!

KNOWN PROBLEMS

These problems are known and will hopefully be fixed in the future:

signalling unlocked variables

In the standard Perl ithreads implementation, you can signal a variable without having to lock() it. This causes a (suppressible) warning. Due to implementation details, probably having to do with communication getting out of sync between server and client thread, forks.pm needs to die when this happens. Patches are welcome.

test-suite exits in a weird way

Although there are no errors in the test-suite, the test harness sometimes thinks there is something wrong because of an unexpected exit() value. Not sure what to do about this yet. This appears to only occur on instances of perl built with native ithreads.

share() doesn't lose value for arrays and hashes

In the standard Perl threads implementation, arrays and hashes are re-initialized when they become shared (with the share() function). The share() function of forks::shared does not re-initialize arrays and hashes when they become shared.

This could be considered a bug in the standard Perl implementation. In any case, it is an inconsistency between the behaviour of threads.pm and forks.pm. Maybe a special "totheletter" option should be added to forks.pm to make it follow this behaviour of threads.pm to the letter.

ORIGINAL AUTHOR CREDITS

All the people reporting problems and fixes. More specifically in alphabetical order:

Stephen Adkins

For finding that a child thread could not wake the very first parent thread with cond_signal, and providing a patch to fix it.

Arthur Bergman

For implementing the first working version of Perl threads support and providing us with an API to build on.

Lars Fenneberg

For helping me through the initial birthing pains.

Paul Golds

For spotting a problem with very large shared scalar values.

Bradley W. Langhorst

For making sure everything runs with warnings enabled.

Juerd Waalboer

For pointing me to the source filter solution for Perl 5.6.x.

CURRENT MAINTAINER

Eric Rybski <rybskej@yahoo.com>.

ORIGINAL AUTHOR

Elizabeth Mattijsen, <liz@dijkmat.nl>.

COPYRIGHT

Copyright (c) 2002-2004 Elizabeth Mattijsen <liz@dijkmat.nl>, 2005 Eric Rybski <rybskej@yahoo.com>. All rights reserved. This program is free software; you can redistribute it and/or modify it under the same terms as Perl itself.

SEE ALSO

threads.