NAME
perlmulticore.h - the Perl Multicore Specification and Implementation
SYNOPSIS
#include "perlmulticore.h"
// in your XS function:
perlinterp_release ();
do_the_C_thing ();
perlinterp_acquire ();
DESCRIPTION
This header file implements a simple mechanism for XS modules to allow re-use of the perl interpreter for other threads while doing some lengthy operation, such as cryptography, SQL queries, disk I/O and so on.
The design goals for this mechanism were to be simple to use, very efficient when not needed, low code and data size overhead and broad applicability.
The newest version of this document can be found at http://pod.tst.eu/http://cvs.schmorp.de/Coro-Multicore/perlmulticore.h.
The newest version of the header file itself, which includes this documentation, can be downloaded from http://cvs.schmorp.de/Coro-Multicore/perlmulticore.h.
HOW DO I USE THIS IN MY MODULES?
The usage is very simple - you include this header file in your XS module. Then, before you do your lengthy operation, you release the perl interpreter:
perlinterp_release ();
And when you are done with your computation, you acquire it again:
perlinterp_acquire ();
And that's it. This doesn't load any modules and consists of only a few machine instructions when no module to take advantage of it is loaded.
Here is a simple example, an flock wrapper implemented in XS. Unlike perl's built-in flock, it allows other threads (for example, those provided by Coro) to execute, instead of blocking the whole perl interpreter. For the sake of this example, it requires a file descriptor instead of a handle.
#include "perlmulticore.h" // this header file
// and in the XS portion
int flock (int fd, int operation)
CODE:
perlinterp_release ();
RETVAL = flock (fd, operation);
perlinterp_acquire ();
OUTPUT:
RETVAL
Another example would be to modify DBD::mysql to allow other threads to execute while executing SQL queries. One way to do this is to find all mysql_st_internal_execute and similar calls (such as mysql_st_internal_execute41), and adorn them with release/acquire calls:
{
perlinterp_release ();
imp_sth->row_num= mysql_st_internal_execute(sth, ...);
perlinterp_acquire ();
}
HOW ABOUT NOT-SO LONG WORK?
Sometimes you don't know how long your code will take - in a compression library, for example, compressing a few hundred kilobytes of data can take a while, while 50 bytes will compress so fast that even attempting to do something else could be more costly than just doing it.
This is a very hard problem to solve. The best you can do at the moment is to release the perl interpreter only when you think the work to be done justifies the expense.
As a rule of thumb, if you expect to need more than a few thousand cycles, you should release the interpreter, else you shouldn't. When in doubt, release.
For example, in a compression library, you might want to do this:
if (bytes_to_be_compressed > 2000) perlinterp_release ();
do_compress (...);
if (bytes_to_be_compressed > 2000) perlinterp_acquire ();
Make sure the if conditions are exactly the same and don't change, so you always call acquire when you release, and vice versa.
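One way to keep the two conditions from drifting apart is to evaluate the condition once and remember the result in a local variable. The following is only a minimal sketch and not part of perlmulticore.h: the two stub macros (which merely count calls), RELEASE_THRESHOLD, do_compress and compress_buffer are hypothetical stand-ins so that the fragment compiles on its own. In a real module the macros come from perlmulticore.h.

```c
#include <stddef.h>
#include <string.h>

/* Stand-ins for the real macros from perlmulticore.h (an assumption, so
   this sketch compiles on its own); they only count how often they run. */
static int release_calls, acquire_calls;
#define perlinterp_release() (void)(++release_calls)
#define perlinterp_acquire() (void)(++acquire_calls)

/* Hypothetical threshold, the value is taken from the example above. */
#define RELEASE_THRESHOLD 2000

/* Hypothetical worker that stands in for the real compression routine. */
static size_t
do_compress (char *dst, const char *src, size_t len)
{
  memcpy (dst, src, len); /* placeholder for the actual work */
  return len;
}

static size_t
compress_buffer (char *dst, const char *src, size_t len)
{
  /* decide once, so release and acquire can never disagree */
  int release = len > RELEASE_THRESHOLD;
  size_t res;

  if (release) perlinterp_release ();
  res = do_compress (dst, src, len);
  if (release) perlinterp_acquire ();

  return res;
}
```

Because the decision is stored before the work starts, it does not matter whether the work itself modifies the value the decision was based on.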
When you don't have a handy indicator, you might still do something useful. For example, if you do some file locking with fcntl and you expect the lock to be available immediately in most cases, you could try with F_SETLK (which doesn't wait), and only release/wait/acquire when the lock couldn't be set:
int res = fcntl (fd, F_SETLK, &flock);
if (res)
{
// error, assume lock is held by another process and do it the slow way
perlinterp_release ();
res = fcntl (fd, F_SETLKW, &flock);
perlinterp_acquire ();
}
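The fragment above treats every error as "lock is held elsewhere". POSIX specifies that F_SETLK fails with EACCES or EAGAIN when a conflicting lock exists, so a slightly stricter variant could take the slow path only for those two errors. This is a sketch under that assumption, not code from perlmulticore.h: lock_fast_then_slow is a hypothetical helper name, and the stub macros only count calls so that the fragment compiles on its own.

```c
#include <errno.h>
#include <fcntl.h>
#include <unistd.h>

/* Stand-ins for the real macros from perlmulticore.h (an assumption). */
static int release_calls, acquire_calls;
#define perlinterp_release() (void)(++release_calls)
#define perlinterp_acquire() (void)(++acquire_calls)

/* Hypothetical helper: take a write lock on the whole file.
   Returns 0 on success and -1 on error, like fcntl. */
static int
lock_fast_then_slow (int fd)
{
  struct flock lk = { 0 };
  int res;

  lk.l_type   = F_WRLCK;
  lk.l_whence = SEEK_SET;
  lk.l_start  = 0;
  lk.l_len    = 0; /* 0 means "until end of file" */

  /* fast path: does not wait, so the interpreter is kept */
  res = fcntl (fd, F_SETLK, &lk);

  /* slow path only when the lock is held by somebody else */
  if (res == -1 && (errno == EACCES || errno == EAGAIN))
    {
      perlinterp_release ();
      res = fcntl (fd, F_SETLKW, &lk);
      perlinterp_acquire ();
    }

  return res;
}
```

Other errors, such as an invalid file descriptor, are then reported immediately instead of being retried with a blocking call.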
THE HARD AND FAST RULES
As with everything, there are a number of rules to follow.
- Never touch any perl data structures after calling perlinterp_release.
-
Possibly the most important rule of them all, anything perl is completely off-limits after perlinterp_release, until you call perlinterp_acquire, after which you can access perl stuff again.
That includes anything in the perl interpreter that you didn't prove to be safe, and didn't prove to be safe in older and future versions of perl: global variables, local perl scalars, even if you are sure nobody accesses them and you only try to "read" their value, and so on.
If you need to access perl things, do it before releasing the interpreter with perlinterp_release, or after acquiring it again with perlinterp_acquire.
- Always call perlinterp_release and perlinterp_acquire in pairs.
-
For each perlinterp_release call there must be a perlinterp_acquire call. They don't have to be in the same function, and you can have multiple calls to them, as long as every perlinterp_release call is followed by exactly one perlinterp_acquire call.
For example, this would be fine:
perlinterp_release ();
if (!function_that_fails_with_0_return_value ())
{
perlinterp_acquire ();
croak ("error"); // croak doesn't return
}
perlinterp_acquire ();
// do other stuff
- Never nest calls to perlinterp_release and perlinterp_acquire.
-
That simply means that after calling perlinterp_release, you must call perlinterp_acquire before calling perlinterp_release again. Likewise, after perlinterp_acquire, you can call perlinterp_release but not another perlinterp_acquire.
- Always call perlinterp_release first.
-
Also simple: you must not call perlinterp_acquire without having called perlinterp_release before.
- Never underestimate threads.
-
While it's easy to add parallel execution ability to your XS module, it doesn't mean it is safe. After you release the perl interpreter, it's perfectly possible that it will call your XS function in another thread, even while your original function still executes. In other words: your C code must be thread safe, and if you use any library, that library must be thread-safe, too.
Always assume that the code between perlinterp_release and perlinterp_acquire is executed in parallel on multiple CPUs at the same time. If your code can't cope with that, you could consider using a mutex to only allow one such execution, which is still better than blocking everybody else from doing anything:
static pthread_mutex_t my_mutex = PTHREAD_MUTEX_INITIALIZER;
perlinterp_release ();
pthread_mutex_lock (&my_mutex);
do_your_non_thread_safe_thing ();
pthread_mutex_unlock (&my_mutex);
perlinterp_acquire ();
- Don't get confused by having to release first.
-
In many real world scenarios, you acquire a resource, do something, then release it again. Don't let this confuse you: here, you already own the resource (the perl interpreter), so you have to release it first and acquire it again later, not the other way around.
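While retrofitting a module, the pairing rules above can be checked mechanically during development. The following is a minimal sketch and not something perlmulticore.h provides: checked_release, checked_acquire, rule_violation and the violations counter are hypothetical names, the stub macros only count calls so that the fragment compiles on its own, and a C11 compiler is assumed for _Thread_local. The state is kept per thread, on the assumption that the code between release and acquire stays on the thread that called release.

```c
#include <stdio.h>

/* Stand-ins for the real macros from perlmulticore.h (an assumption). */
static int release_calls, acquire_calls;
#define perlinterp_release() (void)(++release_calls)
#define perlinterp_acquire() (void)(++acquire_calls)

/* 1 while this thread has released the interpreter, 0 otherwise */
static _Thread_local int interp_released;

/* number of rule violations seen by this thread (debugging aid only) */
static _Thread_local int violations;

/* Hypothetical hook: report a violation and keep going. */
static void
rule_violation (const char *msg)
{
  ++violations;
  fprintf (stderr, "perlmulticore rule violated: %s\n", msg);
}

static void
checked_release (void)
{
  /* a second release without an acquire in between is a nested call */
  if (interp_released)
    rule_violation ("nested perlinterp_release");

  interp_released = 1;
  perlinterp_release ();
}

static void
checked_acquire (void)
{
  /* acquire must always be preceded by a release */
  if (!interp_released)
    rule_violation ("perlinterp_acquire without perlinterp_release");

  perlinterp_acquire ();
  interp_released = 0;
}
```

A development build could call these wrappers in place of the plain macros and switch back for releases. The sketch cannot detect the most important violation, touching perl data while the interpreter is released.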
DESIGN PRINCIPLES
This section discusses how the design goals were reached (you be the judge), how it is implemented, and what overheads this implies.
- Simple to Use
-
All you have to do is identify the place in your existing code where you stop touching perl stuff, do your actual work, and start touching perl stuff again.
Then slap perlinterp_release () and perlinterp_acquire () around the actual work code.
You have to include perlmulticore.h and distribute it with your XS code, but all these things border on the trivial.
- Very Efficient
-
The definition for perlinterp_release and perlinterp_acquire is very short:
#define perlinterp_release() perl_multicore_api->pmapi_release ()
#define perlinterp_acquire() perl_multicore_api->pmapi_acquire ()
Both are macros that read a pointer from memory (perl_multicore_api), dereference a function pointer stored at that place, and call the function, which takes no arguments and returns nothing.
The first call to perlinterp_release will check for the presence of any supporting module, and if none is loaded, will create a dummy implementation where both pmapi_release and pmapi_acquire execute this function:
static void perl_multicore_nop (void) { }
So in the case of no magical module being loaded, all calls except the first are two memory accesses and a predictable function call of an empty function.
Of course, the overhead is much higher when these functions actually implement anything useful, but you always get what you pay for.
With Coro::Multicore, every release/acquire involves two pthread switches, two coro thread switches, a bunch of syscalls, and sometimes interacting with the event loop.
A dedicated thread pool such as the one IO::AIO uses could reduce these overheads, and would also reduce the dependencies (AnyEvent is a smaller and more portable dependency than Coro), but it would require a lot more work on the side of the module author wanting to support it than this solution.
- Low Code and Data Size Overhead
-
On a 64 bit system, perlmulticore.h uses exactly 8 octets (one pointer) of your data segment, to store the perl_multicore_api pointer. In addition it creates a 16 octet perl string to store the function pointers in, and stores it in a hash provided by perl for this purpose.
This is pretty much the equivalent of executing this code:
$existing_hash{perl_multicore_api} = "123456781234567812345678";
And that's it, which is, I think, indeed very little.
As for code size, on my amd64 system, every call to perlinterp_release or perlinterp_acquire results in a variation of the following 9-10 octet sequence:
150> mov 0x200f23(%rip),%rax # <perl_multicore_api>
157> callq *0x8(%rax)
The biggest part is the initialisation code, which consists of 11 lines of typical XS code. On my system, all the code in perlmulticore.h compiles to less than 160 octets of read-only data.
- Broad Applicability
-
While there are alternative ways to achieve the goal of parallel execution with threads that might be more efficient, this mechanism was chosen because it is very simple to retrofit existing modules with it.
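The dispatch mechanism described under "Very Efficient" can be sketched in plain C. This is an illustration, not the actual contents of perlmulticore.h: the names perl_multicore_api, pmapi_release, pmapi_acquire and perl_multicore_nop come from the text above, while the struct layout, perl_multicore_init, the two static tables and lookup_implementation are assumptions. In particular, the lookup in the hash provided by perl is replaced by a stub that never finds a supporting module.

```c
/* Assumed layout: one function pointer per operation. */
struct perl_multicore_api
{
  void (*pmapi_release)(void);
  void (*pmapi_acquire)(void);
};

static void perl_multicore_init (void);

/* Initial table: the first release call runs the one-time initialisation.
   The acquire slot stays empty, as acquire is never called before release. */
static const struct perl_multicore_api perl_multicore_api_init
  = { perl_multicore_init, 0 };

static const struct perl_multicore_api *perl_multicore_api
  = &perl_multicore_api_init;

#define perlinterp_release() perl_multicore_api->pmapi_release ()
#define perlinterp_acquire() perl_multicore_api->pmapi_acquire ()

/* Dummy implementation used when no supporting module is loaded. */
static void
perl_multicore_nop (void)
{
}

static const struct perl_multicore_api perl_multicore_api_nop
  = { perl_multicore_nop, perl_multicore_nop };

/* Hypothetical stand-in for the lookup in the hash provided by perl.
   Returning 0 means "no supporting module is loaded". */
static const struct perl_multicore_api *
lookup_implementation (void)
{
  return 0;
}

static void
perl_multicore_init (void)
{
  const struct perl_multicore_api *api = lookup_implementation ();

  /* from now on, calls go straight to the real or the dummy table */
  perl_multicore_api = api ? api : &perl_multicore_api_nop;

  /* the caller asked for a release, so forward this first call */
  perl_multicore_api->pmapi_release ();
}
```

After the first perlinterp_release, every further call is a load of perl_multicore_api followed by an indirect call. On a 64 bit system such a table holds two 8 octet function pointers, which matches the 16 octet string mentioned above.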
DISABLING PERL MULTICORE AT COMPILE TIME
You can disable the complete perl multicore API by defining the symbol PERL_MULTICORE_DISABLE to 1 (e.g. by specifying -DPERL_MULTICORE_DISABLE as compiler argument).
This will leave no traces of the API in the compiled code; suitable "empty" perlinterp_release and perlinterp_acquire definitions will be provided.
This could be added to perl's CPPFLAGS when configuring perl on platforms that do not support threading at all, for example.
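What such "empty" definitions might look like is sketched below. This is an assumption about their shape, not a quotation from the header: the do { } while (0) form is simply a common C idiom for a macro that expands to a statement doing nothing. The #else branch, api_calls and do_work are hypothetical stand-ins so that the fragment compiles on its own, and the symbol is defined in the source here only because there is no compiler command line in this example.

```c
/* normally passed as -DPERL_MULTICORE_DISABLE on the compiler command line */
#define PERL_MULTICORE_DISABLE 1

static int api_calls; /* counts calls that reach the (stubbed) API */

#if PERL_MULTICORE_DISABLE
  /* expand to an empty statement: no code and no data are generated */
  #define perlinterp_release() do { } while (0)
  #define perlinterp_acquire() do { } while (0)
#else
  /* hypothetical stand-ins for the real definitions */
  #define perlinterp_release() (void)(++api_calls)
  #define perlinterp_acquire() (void)(++api_calls)
#endif

/* XS-style code stays unchanged, whether the API is disabled or not */
static int
do_work (int x)
{
  int res;

  perlinterp_release ();
  res = x * 2; /* placeholder for the lengthy operation */
  perlinterp_acquire ();

  return res;
}
```

The calling code does not need any #if of its own, since the macros keep the same names in both configurations.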
AUTHOR
Marc A. Lehmann <perlmulticore@schmorp.de>
http://perlmulticore.schmorp.de/
LICENSE
The perlmulticore.h header file is put into the public domain. Where this is legally not possible, or at your option, it can be licensed under the Creative Commons CC0 license: https://creativecommons.org/publicdomain/zero/1.0/.