NAME

IPC::Shareable - share Perl variables between processes

SYNOPSIS

  use IPC::Shareable;
  tie($scalar, IPC::Shareable, $glue, { %options });
  tie(%hash, IPC::Shareable, $glue, { %options });
  (tied %hash)->shlock;
  (tied %hash)->shunlock;

CONVENTIONS

A number in square brackets, as in [N], in the text of this document refers to a numbered note in the "NOTES" section.

DESCRIPTION

IPC::Shareable allows you to tie a variable to shared memory, making it easy to share the contents of that variable with other Perl processes. Currently either scalars or hashes can be tied; tying of arrays remains a work in progress. However, the variable being tied may contain arbitrarily complex data structures - including references to arrays, hashes of hashes, etc. See "REFERENCES" below for more information.

The association between variables in distinct processes is provided by $glue. This is an integer number or 4 character string[1] that serves as a common identifier for data across process space. Hence the statement

        tie($scalar, IPC::Shareable, 'data');

in program one and the statement

        tie($variable, IPC::Shareable, 'data');

in program two will bind $scalar in program one and $variable in program two to the same shared data. There is no pre-set limit to the number of processes that can bind to data; nor is there a pre-set limit to the size or complexity of the underlying data of the tied variables[2].

The bound data structures are all linearized (using Raphael Manfredi's Storable module) before being slurped into shared memory. Upon retrieval, the original format of the data structure is recovered. Semaphore flags are used for versioning and managing a per-process cache, allowing quick retrieval of data when, for instance, operating on a tie()d variable in a tight loop.

OPTIONS

Options are specified by passing a reference to a hash as the fourth argument to the tie function that enchants a variable. Alternatively you can pass a reference to a hash as the third argument; IPC::Shareable will then look at the field named 'key' in this hash for the value of $glue. So,

        tie($variable, IPC::Shareable, 'data', \%options);

is equivalent to

        tie($variable, IPC::Shareable,
            { 'key' => 'data', ... });

When defining an options hash, values that match the word 'no' in a case-insensitive manner are treated as false. Therefore, setting $options{'create'} = 'No'; is the same as $options{'create'} = 0;.
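As a rough illustration of the two calling conventions and the 'no' rule described above, the following plain-Perl sketch normalizes the arguments that follow the class name in a call to tie(). The helper name normalize_tie_args() is hypothetical and is not part of IPC::Shareable, and the exact pattern the module uses to recognize 'no' is an assumption (here the whole value must be the word 'no').

```perl
use strict;
use warnings;

# Hypothetical helper, not part of IPC::Shareable: accept either
# ($glue, \%options) or (\%options) and return ($glue, \%options).
sub normalize_tie_args {
    my ($glue, $opts) = @_;

    # Second calling form: the options hash comes first and
    # carries the glue in its 'key' field.
    if (ref($glue) eq 'HASH') {
        $opts = $glue;
        $glue = $opts->{key};
    }

    my %clean = %{ $opts || {} };
    for my $name (keys %clean) {
        # Assumption: a value that is exactly 'no' (any case) is false.
        $clean{$name} = 0
            if defined $clean{$name} && $clean{$name} =~ /^no$/i;
    }
    return ($glue, \%clean);
}

my ($glue, $opts) = normalize_tie_args({ key => 'data', create => 'No' });
print "glue=$glue create=$opts->{create}\n";
```

Both calling forms should yield the same glue, and 'No', 'NO' and 'no' should all end up as 0.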

The following fields are recognized in the options hash.

key

The 'key' field is used to determine the $glue if $glue was not present in the call to tie(). This argument is then, in turn, used as the KEY argument in subsequent calls to shmget() and semget(). If this field is not provided, a value of IPC_PRIVATE is assumed, meaning that your variables cannot be shared with other processes. (Note that setting $glue to 0 is the same as using IPC_PRIVATE.)

create

If 'create' is set to a true value, IPC::Shareable will create a new binding associated with $glue if such a binding does not already exist. If 'create' is false, calls to tie() will fail (returning undef) if such a binding does not already exist. This is achieved by ORing IPC_CREAT into the FLAGS argument of calls to shmget() when 'create' is true.

exclusive

If the 'exclusive' field is set to a true value, calls to tie() will fail (returning undef) if a data binding associated with $glue already exists. This is achieved by ORing IPC_EXCL into the FLAGS argument of calls to shmget() when 'exclusive' is true.

mode

The mode argument is an octal number specifying the access permissions when a new data binding is being created. These access permissions are the same as file access permissions in that 0666 is world readable and writable, 0600 is readable and writable only by the effective UID of the process creating the shared variable, etc. If not provided, a default of 0666 (world readable and writable) will be assumed.

destroy

If set to a true value, the data binding will be destroyed when the process calling tie() exits (gracefully)[3].

LOCKING

Shareable provides methods to implement application-level locking of the shared data structures. These methods are called shlock() and shunlock(). To use them you must first get the tied object, either by saving the return value of the original call to tie() or by using the built-in tied() function.

To lock a variable, do this:

  $knot = tie($scalar, IPC::Shareable, $glue, { %options });
  ...
  $knot->shlock;

or equivalently

  tie($scalar, IPC::Shareable, $glue, { %options });
  (tied $scalar)->shlock;

This will place an exclusive lock on the data of $scalar.

To unlock a variable do this:

  $knot->shunlock;

or

  (tied $scalar)->shunlock;

Note that there is no mechanism for shared locks. That's not a big drawback, since you're probably safe to rely on Shareable's internal locking mechanism in situations that would normally call for a shared lock. In general, a lock only needs to be applied during a non-atomic write operation. For instance, a statement like

  $scalar = 10;

doesn't really need a lock since it's atomic. However, if you want to increment, you really should do

  (tied $scalar)->shlock;
  ++$scalar;
  (tied $scalar)->shunlock;

since ++$scalar is non-atomic.

Read-only operations are (I think) atomic so you don't really need to lock for them.

There are some pitfalls regarding locking and signals that you should make yourself aware of; these are discussed in "NOTES".
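One way to make sure every shlock() is paired with a shunlock(), even when the code in between dies, is a small wrapper like the one sketched here. The name with_shlock() is hypothetical and not part of IPC::Shareable; the sketch relies only on the shlock() and shunlock() methods described above, and it does not help with untrapped signals.

```perl
use strict;
use warnings;

# Hypothetical helper, not part of IPC::Shareable: run a block of
# code while holding the exclusive lock on a tied variable.
# $knot is the tied object, i.e. the return value of tie() or of
# tied(); only its shlock() and shunlock() methods are used.
sub with_shlock {
    my ($knot, $code) = @_;

    $knot->shlock;
    my @result;
    my $ok    = eval { @result = $code->(); 1 };
    my $error = $@;
    $knot->shunlock;          # release the lock even if $code died

    die $error unless $ok;    # re-throw after unlocking
    return wantarray ? @result : $result[0];
}
```

With a tied $scalar, the increment shown earlier would then read with_shlock(tied($scalar), sub { ++$scalar });.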

REFERENCES

If a variable tie()d to Shareable contains references, Shareable acts in different ways depending upon the initial state of the thingy being referenced.

The Thingy Referenced Is Initially False

If Shareable encounters in a tie()d variable a reference to an empty hash or a scalar with a false value, Shareable will attempt to tie() the hash or scalar being referenced. If a reference is to an empty array, Shareable defaults to its other behaviour described below since Shareable cannot tie() arrays.

References to empty hashes can occur whenever a tie()d variable is cast in a context that forces references to "spring into existence". Consider, for instance, the following assignment to a tie()d %hash:

    $hash{'foo'}{'bar'} = 'xyzzy';

This statement assigns to $hash{'foo'} a reference to an anonymous hash. In the anonymous hash it assigns to the key 'bar' the value 'xyzzy'. Since %hash is tie()d, the assignment triggers Shareable, but when Shareable is called, the anonymous hash is still empty. Shareable then immediately tie()s the anonymous hash so that when the assignment { 'bar' => 'xyzzy' } is made, Shareable can catch it.

One consequence of this behaviour is a statement like

    $scalar = {};

will, for a tie()d $scalar, cause Shareable to tie() the anonymous hash. Consider this a supported bug. It does, however, mean that statements like

    $scalar->{'foo'} = 'bar';

should work as expected.

Be warned, however, that each variable tie()d to Shareable requires (at least) one shared memory segment and one set of three semaphores. If you use this feature too liberally, you can find yourself running out of semaphores quickly. If that happens to you, consider resorting to Shareable's other behaviour described in the following section.

Another potential problem at the time of writing with using this behaviour is that locking using shlock() and shunlock() is unreliable. This is because a data structure spans more than one tie()d variable. It is advisable to implement your own locking mechanism if you plan on using this behaviour of Shareable.

The Thingy Referenced Is Initially True

If Shareable encounters in a tie()d variable a reference to a hash with any key/value pairs, a reference to a true scalar, or a reference to any array, the contents of the referenced thingy are slurped into the same shared memory segment as the original tie()d variable. What that means is that a statement like

    $scalar = [ 0 .. 9 ];

makes the contents of the anonymous array referenced by a tie()d $scalar visible to other processes.

The good side of this behaviour is that a data structure can be arbitrarily complex and still only require one set of three semaphores. The downside becomes evident when you try to modify the contents of such a referenced thingy, either in the original process or elsewhere. A statement like

    push(@$scalar, 10, 11, 12);

modifies only the untied anonymous array referenced by $scalar and not the tie()d $scalar itself. Consequently, the change to the anonymous array would be visible only in the process making this statement.

A workaround is to remember which variable is really tie()d and to make sure you assign into that variable every time you change a thingy that it references. An alternative to the above statement that works is

    $scalar = [ (@$scalar, 10, 11, 12) ];
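To see why the reassignment matters, here is a toy demonstration that runs without shared memory. The CountingScalar class is a stand-in invented for this illustration, not IPC::Shareable itself: it only counts how often its STORE method is called, STORE being the hook through which a tie()d scalar learns about new values.

```perl
use strict;
use warnings;

# Toy tie class, standing in for IPC::Shareable: it only counts
# how often STORE is called. This is an illustration, not the
# module's implementation.
package CountingScalar;
sub TIESCALAR { my ($class) = @_; return bless { value => undef, stores => 0 }, $class }
sub FETCH     { my ($self) = @_; return $self->{value} }
sub STORE     { my ($self, $value) = @_; $self->{value} = $value; $self->{stores}++; return }

package main;

my $knot = tie my $scalar, 'CountingScalar';

$scalar = [ 0 .. 9 ];               # STORE is called
push(@$scalar, 10, 11, 12);         # only FETCH is called
my $stores_after_push = $knot->{stores};

$scalar = [ (@$scalar, 13) ];       # STORE is called again
my $stores_after_assign = $knot->{stores};

print "STORE calls after push: $stores_after_push\n";
print "STORE calls after reassignment: $stores_after_assign\n";
```

In this toy class the counter should not move on the push() and should go up by one on the reassignment, which is the moment a real tie()d variable gets the chance to write to shared memory.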

EXAMPLES

In a file called server:

    #!/usr/bin/perl -w
    use IPC::Shareable;
    $glue = 'data';
    %options = (
        'create' => 'yes',
        'exclusive' => 'no',
        'mode' => 0644,
        'destroy' => 'yes',
    );
    tie(%colours, IPC::Shareable, $glue, { %options }) or
        die "server: tie failed\n";
    %colours = (
        'red' => [
             'fire truck',
             'leaves in the fall',
        ],
        'blue' => [
             'sky',
             'police cars',
        ],
    );
    (print("server: there are 2 colours\n"), sleep 5)
        while scalar keys %colours == 2;
    print "server: here are all my colours:\n";
    foreach $colour (keys %colours) {
        print "server: these are $colour: ",
            join(', ', @{$colours{$colour}}), "\n";
    }
    exit;

In a file called client:

    #!/usr/bin/perl -w
    use IPC::Shareable;
    $glue = 'data';
    %options = (
        'key' => 'paint',
        'create' => 'no',
        'exclusive' => 'no',
        'mode' => 0644,
        'destroy' => 'no',
        );
    tie(%colours, IPC::Shareable, $glue, { %options }) or
        die "client: tie failed\n";
    foreach $colour (keys %colours) {
        print "client: these are $colour: ",
            join(', ', @{$colours{$colour}}), "\n";
    }
    delete $colours{'red'};
    exit;

And here is the output (the sleep commands in the command line prevent the output from being interrupted by shell prompts):

    bash$ ( ./server & ) ; sleep 10 ; ./client ; sleep 10
    server: there are 2 colours
    server: there are 2 colours
    server: there are 2 colours
    client: these are blue: sky, police cars
    client: these are red: fire truck, leaves in the fall
    server: here are all my colours:
    server: these are blue: sky, police cars

RETURN VALUES

Calls to tie() that try to implement IPC::Shareable will return a true value if successful and undef otherwise. On success, the value returned is an instance of the IPC::Shareable class.

INTERNALS

When a variable is tie()d, a blessed reference to a SCALAR is created. (This is true even if it is a HASH being tie()d.) The value thereby referred is an integer[4] ID that is used as a key in a hash called %IPC::Shareable::Shm_Info; this hash is created and maintained by IPC::Shareable to manage the variables it has tie()d. When IPC::Shareable needs to perform an operation on a tie()d variable, it dereferences the blessed reference to perform a lookup in %IPC::Shareable::Shm_Info for the information needed to proceed.

%IPC::Shareable::Shm_Info has the following structure:

    %IPC::Shareable::Shm_Info = (

        # - The ID of an enchanted variable
        $id => {

            # -  A literal indicating the variable type
            'type' => 'SCALAR' || 'HASH',

            # - The $glue used when tie() was called
            'key' => $glue,

            # - Shm segment IDs for this variable
            'frag_id' => {
                '0' => $id_1, # - ID of first shm segment
                '1' => $id_2, # - ID of next shm segment
                ... # - etc
            },

            # - ID of associated semaphores
            'sem_id' => $semid,

            # - The options passed when tie() was called
            'options' => { %options },

            # - The value of FLAGS for shmget() calls.
            'flags' => $flags,

            # - Destroy shm segments on exit?
            'destroy' => $destroy,

            # - The version number of the cached data
            'version' => $version,

            # - A flag that indicates if this process
            # - has a lock on this variable
            'lock' => $lock_flag,

            # - A flag that indicates whether an
            # - iteration of this variable is in
            # - progress and we should use the local
            # - cache only until the iteration is over.
            # - Meaningless for scalars.
            'hash_iterating' => $iteration_flag,

            # - Data cache; data will be retrieved from
            # - here when this process's version is the
            # - same as the public version, or when we
            # - have a hash in the middle of some kind
            # - of iteration
            'DATA' => {
                # - User data; where the real
                # - information is stored
                'user' => \$data || \%data,
                # - Internal data used by Shareable to
                # - attach to any thingies referenced
                # - by this variable; see REFERENCES
                # - above
                'internal' => {
                    # - Identifier of a thingy attached
                    # - to this variable
                    $string_1 => {
                        # - The shared memory id of the
                        # - attached thingy
                        'shm_id' => $attached_shmid,
                        # - The $glue used when tie()ing
                        # - to this thingy
                        'key' => $glue,
                        # - Type of thingy to attach to
                        'ref_type' => $type,
                        # - Where to store the reference
                        # - to this thingy
                        'hash_key' => $hash_key,
                    },
                    $string_2 => {
                        ... # - Another set of keys like
                            # - $string_1
                    },
                    ... # - Additional $string_n's if
                        # - need be.
                },
            },

            # - List of associated data structures, and 
            # - flags that indicate if this process has
            # - successfully attached to them
            'attached' => {
                $string_1 => $attached_flag1,
                $string_2 => $attached_flag2,
            },

        },
       ... # - IDs of additional tie()d variables
   );

Perhaps the most important thing to note is the existence of the 'DATA' and 'version' fields: data for all tie()d variables is stored locally in a per-process cache. When storing data, the values of the semaphores referred to by $Shm_Info{$id}{'sem_id'} are changed to indicate to the world that a new version of the data is available. When retrieving data for a tie()d variable, the values of these semaphores are examined to see if another process has created a more recent version than the cached version. If a more recent version is available, it will be retrieved from shared memory and used. If no more recent version has been created, the cached version is used.
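The version check can be pictured with the following simulation, which uses ordinary hashes in a single process. Everything in it is an assumption made for illustration: %shared stands in for the shared memory segment plus the semaphore holding the public version number, %cache for the 'version' and 'DATA' fields, and store_value() and fetch_value() are made-up names rather than module functions.

```perl
use strict;
use warnings;

# Stand-ins for the real resources (assumptions for illustration):
# %shared plays the role of the shared memory segment plus the
# semaphore that holds the public version number.
my %shared = (version => 0, data => undef);

# Per-process cache, like the 'version' and 'DATA' fields.
my %cache = (version => -1, data => undef);
my $shared_reads = 0;     # counts trips to "shared memory"

sub store_value {
    my ($value) = @_;
    $shared{data} = $value;
    $shared{version}++;                       # announce a new version
    @cache{qw(version data)} = ($shared{version}, $value);
    return;
}

sub fetch_value {
    if ($cache{version} != $shared{version}) {
        # Someone published a newer version: refresh the cache.
        $shared_reads++;
        @cache{qw(version data)} = @shared{qw(version data)};
    }
    return $cache{data};
}

store_value('first');
fetch_value() for 1 .. 1000;                   # served from the cache

# Simulate another process writing a new version.
$shared{data} = 'second';
$shared{version}++;
my $latest = fetch_value();                    # triggers one refresh
```

The thousand reads in the loop never touch %shared; only the read that follows the simulated foreign write does, which is the saving the cache is meant to provide in a tight loop.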

Also stored in the 'DATA' field is a structure that identifies any "magically created" tie()d variables associated with this variable. These variables are created by assignments like the following:

    $hash{'foo'}{'bar'} = 'xyzzy';

See "REFERENCES" for a complete explanation.

Another important thing to know is that IPC::Shareable allocates shared memory of a constant size SHM_BUFSIZ, where SHM_BUFSIZ is defined in this module. If the amount of (serialized) data exceeds this value, it will be fragmented into multiple segments during a write operation and reassembled during a read operation.
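The fragment-and-reassemble idea can be sketched with Storable and unpack() alone, without any shared memory. The constant DEMO_BUFSIZ and the functions fragment() and reassemble() are hypothetical names for this illustration; the module's real SHM_BUFSIZ value is not stated in this document, and 16 bytes is chosen here only to force several fragments.

```perl
use strict;
use warnings;
use Storable qw(freeze thaw);

# Assumed buffer size for the demonstration only; the module's
# real SHM_BUFSIZ value is not stated in this document.
use constant DEMO_BUFSIZ => 16;

# Split a serialized string into DEMO_BUFSIZ-byte fragments;
# the last fragment may be shorter.
sub fragment {
    my ($serialized) = @_;
    my $size = DEMO_BUFSIZ;
    return unpack("(a$size)*", $serialized);
}

# Reassemble the fragments and recover the data structure.
sub reassemble {
    my (@fragments) = @_;
    return thaw(join('', @fragments));
}

my %colours = (
    red  => [ 'fire truck', 'leaves in the fall' ],
    blue => [ 'sky', 'police cars' ],
);
my @fragments = fragment(freeze(\%colours));
my $copy      = reassemble(@fragments);
print scalar(@fragments), " fragments, red: @{ $copy->{red} }\n";
```

In the module each fragment would live in its own shared memory segment (see the 'frag_id' field above); here they are simply elements of a Perl array.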

Lastly, notice that if you tie() a hash and begin iterating over it, you will get data from and write to your local cache until Shareable thinks you've reached the end of the iteration. At this point Shareable writes out the entire contents of your hash to shared memory. This is done so you can safely iterate via keys(), values(), and each() without having to worry about somebody else clobbering a key in the middle of the loop.

AUTHOR

Benjamin Sugars <bsugars@canoe.ca>

NOTES

Footnotes from the above sections

  1. If $glue is longer than 4 characters, only the 4 most significant characters are used. These characters are turned into integers by unpack()ing them. If $glue is less than 4 characters, it is space padded.

  2. IPC::Shareable provides no pre-set limits, but the system does. Namely, there are limits on the number of shared memory segments that can be allocated and the total amount of memory usable by shared memory.

  3. If the process has been smoked by an untrapped signal, the binding will remain in shared memory. If you're cautious, you might try

        $SIG{INT} = \&catch_int;
        sub catch_int {
            exit;
        }
        ...
        tie($variable, IPC::Shareable, 'data',
            { 'destroy' => 'Yes!' });

    which will at least clean up after your user hits CTRL-C because IPC::Shareable's DESTROY method will be called. Or, maybe you'd like to leave the binding in shared memory, so subsequent processes can recover the data...

  4. The integer happens to be the shared memory ID of the first shared memory segment used to store the variable's data.
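The padding and truncation described in note 1 can be tried out with pack() and unpack() directly. This is only a sketch: glue_to_key() is a hypothetical helper, and the 'N' unpack template (32-bit, big-endian) is an assumption chosen for portability, so the integers it produces need not match the ones the module computes.

```perl
use strict;
use warnings;

# Hypothetical helper, not the module's own code: turn a glue
# string into an integer key. The 'N' unpack template (32-bit,
# big-endian) is an assumption made for portability here.
sub glue_to_key {
    my ($glue) = @_;
    return $glue if $glue =~ /^\d+$/;      # already an integer
    # 'A4' keeps the first 4 characters and pads with spaces.
    return unpack('N', pack('A4', $glue));
}

printf "%-10s => %d\n", $_, glue_to_key($_) for qw(data database ab);
```

Under these assumptions 'data' and 'database' map to the same key, since only the first four characters survive, and 'ab' is treated as 'ab' followed by two spaces.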

General Notes

o

When using shlock() to lock a variable, be careful to guard against signals. Under normal circumstances, Shareable's DESTROY method unlocks any locked variables when the process exits. However, if an untrapped signal is received while a process holds an exclusive lock, DESTROY will not be called and the lock may be maintained even though the process has exited. If this scares you, you might be better off implementing your own locking methods.

o

The bulk of Shareable's behaviour when dealing with references relies on undocumented (and possibly unsupported) features of perl. Changes to perl in the future could break Shareable.

o

As mentioned in "INTERNALS", shared memory segments are acquired with sizes of SHM_BUFSIZ. SHM_BUFSIZ's largest possible value is nominally SHMMAX, which is highly system-dependent. Indeed, for some systems it may be defined at boot time. If you can't seem to tie() any variables, it may be that SHM_BUFSIZ is set to a value that exceeds SHMMAX on your system. Try reducing the size of SHM_BUFSIZ and recompiling the module.

o

The class contains a translation of the constants defined in the <sys/ipc.h>, <sys/shm.h>, and <sys/sem.h> header files. These constants are used internally by the class and cannot be imported into a calling environment. To do that, use IPC::SysV instead. Indeed, I would have used IPC::SysV myself, but I haven't been able to get it to compile on any system I have access to :-(.

o

Use caution when choosing your values of $glue. If IPC::Shareable needs to acquire more shared memory segments (due to a buffer overrun, or implicit referencing), those shared memory segments will have a different $glue than the $glue supplied by the application. In general, $glues should be well separated: aaaa and zzzz are good choices, since they are unlikely to collide, but aaaa and aaab could easily collide.

o

The ipcs(1/8) program, available on at least Solaris and Linux, can be useful for finding moribund shared memory segments or semaphore sets produced by bugs in either IPC::Shareable or applications using it; its companion ipcrm removes them.

o

IPC::Shareable version 0.20 or greater does not understand the format of shared memory segments created by earlier versions of IPC::Shareable. If you try to tie to such segments, you will get an error. The only work around is to clear the shared memory segments and start with a fresh set.

o

Set the variable $IPC::Shareable::Debug to a true value to produce *many* verbose debugging messages on the standard error (I don't use the Perl debugger as much as I should... )

CREDITS

Thanks to all those with comments or bug fixes, especially Stephane Bortzmeyer <bortzmeyer@pasteur.fr>, Michael Stevens <michael@malkav.imaginet.co.uk>, Richard Neal <richard@imaginet.co.uk>, Jason Stevens <jstevens@chron.com>, Maurice Aubrey <maurice@hevanet.com>, and Doug MacEachern <dougm@telebusiness.co.nz>.

BUGS

Certainly; this is alpha software. When you discover an anomaly, send me an email at bsugars@canoe.ca.

SEE ALSO

perl(1), perltie(1), Storable(3), shmget(2) and other SysV IPC man pages.