NAME

RPC::Serialized - Subroutine calls over the network using common serialization

VERSION

version 1.123630

SYNOPSIS

# for the RPC server...

# choose one of the supplied server types (NetServer is Net::Server)
use RPC::Serialized::Server::NetServer;

my $s = RPC::Serialized::Server::NetServer->new;
$s->run;
    # server process is now looping and waiting for RPC (like Apache prefork)
    # the default port number for Net::Server is 20203

# and so for the RPC client...

use RPC::Serialized::Client::INET;

my $c = RPC::Serialized::Client::INET->new({
    io_socket_inet => {PeerPort => 20203},
});
 
my $result = $c->remote_sub_name(qw/ some data /);
    # remote_sub_name gets mapped to an invocation on the RPC server
    # it's best to wrap this in an eval{} block

DESCRIPTION

This module allows you to call a Perl subroutine in another process, possibly on another machine, using a very simple and extensible interface which ties together the features of other good modules from the CPAN.

There are lots of uses for RPC (remote procedure calls), so here are a couple of examples just to give you an idea:

Privilege separation

If you have a web interface which is used to control a critical backend system, perhaps a key database or the settings on a firewall, you can use RPC to prevent security flaws on the web service from affecting the backend service. Only procedure calls which are permitted will be accepted from the web host, and it also offers a nice interface separation for your systems.

File or data access

To avoid sharing of filesystems over the network (SAMBA, NFS, etc), you can provide a restricted interface using RPC. For example a web service to search log files could send an RPC request with the search string to the log server, and display the results. There would be no need to run a network filesystem.

What makes this module different from another RPC implementation?

Data Serialization

This module uses Data::Serializer to construct its "on the wire" protocol. This means any Perl data structure can be sent or received, even Perl code itself in the case of some of the serialization modules (e.g. YAML supports this). You can also encrypt and compress data; all options to Data::Serializer are easily available through the configuration of this module.

Simple deployment

Each remote procedure is simply a perl subroutine in a module which is loaded by the server. You can let the server autoload everything as it is called, or specify each "handler" subroutine individually (or combine both!). Adding and modifying the available handlers is simple, meaning you think less about the RPC subsystem and more about your code and service provision.

Flexible configuration

All the modules used by RPC::Serialized can be fully configured as you wish, from one configuration file, or via options to new(). You saw an example of this in the "SYNOPSIS" section, above, for the IO::Socket::INET module.

The following sections take you through setting up an RPC server and client.

GENERAL CONFIGURATION

Both the client and server parts of this module share the same configuration system. There is a file, RPC::Serialized::Config, which contains the basic defaults suitable for most situations. Then, you can also specify the name of a configuration file as a parameter to any call to new(). Finally, a hash reference of options can be supplied directly to any call to new(). Let's go through these cases by example:

# this is for a server, but the example applies equally to a client
use RPC::Serialized::Server::NetServer;

# no configuration at all - use the built in defaults
$s = RPC::Serialized::Server::NetServer->new;

If you are happy with the settings in the source code of RPC::Serialized::Config, then no options are required. In the case of some types of server and client this is enough to get you going.

# load a configuration file, using Config::Any  
$s = RPC::Serialized::Server::NetServer->new('/path/to/config/file');

Alternatively, specify a file with configuration. We use Config::Any to load the file, and so it should contain a data structure in one of the formats supported by that module. Tip: make sure there is a filename suffix, e.g. .yml, to help Config::Any load the data. For details of the required structure of that data, read on...

# pass some options directly
$s = RPC::Serialized::Server::NetServer->new({
     net_server => {log_file => undef, port => 5233},
     rpc_serialized => {handler_namespaces => 'RPC::Serialized::Handler'},
});

You can pass a hash reference to new(). Each key in this hash is the name of a module to configure. The module names are converted to lowercase, and the :: separator is replaced by an underscore. In the example above we are providing some options to Net::Server and this module, RPC::Serialized.

The value of each key is another anonymous hash, this time with any options as specified by that module's own manual page. Of course, this only works for modules which use key/value options themselves, but thankfully that is the case for each module used by RPC::Serialized.
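
The naming rule for these keys can be sketched in a few lines of Perl. The config_key() helper below is hypothetical (it is not part of RPC::Serialized) and simply applies the lowercase-and-underscore rule described above:

```perl
use strict;
use warnings;

# Hypothetical helper, not part of RPC::Serialized: derive the
# configuration key for a module name (lowercase, '::' becomes '_').
sub config_key {
    my $module = shift;
    (my $key = lc $module) =~ s/::/_/g;
    return $key;
}

# Module names taken from this manual page
print config_key($_), "\n"
    for qw/ Net::Server Data::Serializer IO::Socket::INET RPC::Serialized /;
```

So options for Net::Server live under net_server, and options for Data::Serializer under data_serializer, as seen elsewhere in this manual page.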

Remember, in all of the examples in this manual page which show passing configuration settings to new(), you can also achieve the same thing using a configuration file by passing its name to new() instead.

As a final note on this topic, you can provide both a configuration filename and a hash reference of options to the new() call. In fact, however many of these you provide, they are read in order, with the later ones taking precedence.
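
To picture how "later ones win" plays out, here is a minimal sketch. The merge_config() helper is hypothetical, and the option-by-option merge within each module is an assumption made for illustration; the module's real loading code may differ in detail:

```perl
use strict;
use warnings;

# Hypothetical sketch: merge configuration sources in order, so that
# later sources override earlier ones. Each source is a hash reference
# keyed by module (rpc_serialized, net_server, ...). Merging option by
# option within each module is an assumption, not documented behaviour.
sub merge_config {
    my @sources = @_;
    my %merged;
    for my $source (@sources) {
        for my $module (keys %$source) {
            my $options = $source->{$module};
            # later values replace earlier ones for the same option
            $merged{$module}{$_} = $options->{$_} for keys %$options;
        }
    }
    return \%merged;
}

my $defaults  = { rpc_serialized => { timeout => 30, trace => 0 } };
my $from_file = { rpc_serialized => { timeout => 15 } };   # stand-in for a loaded file
my $from_new  = { rpc_serialized => { trace => 1 }, net_server => { port => 5233 } };

my $config = merge_config($defaults, $from_file, $from_new);
printf "timeout=%d trace=%d port=%d\n",
    $config->{rpc_serialized}{timeout},
    $config->{rpc_serialized}{trace},
    $config->{net_server}{port};
```

Here the file lowers the timeout and the hash reference switches tracing on, while anything not mentioned keeps its earlier value.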

SETTING UP A SERVER

You do not have to know too much about internet servers to use this module. As well as providing a standard TCP and UDP server, there are also UNIX socket and Standard Input/Output servers that you might use for communicating with clients on the same host system.

In the main, we are dealing with a UNIX world here, so most of the description will make assumptions about that. If you get this module running on Windows, please let the author know!

A small perl script which starts the server running is all you need. This can be copied verbatim from the "SYNOPSIS" section above. For guidance on providing configuration to the server, see "GENERAL CONFIGURATION", above.

When running a server process, it can either stay in the foreground, or detach from your shell. The default is to stay in the foreground, for two reasons. First, when you are developing you probably want to start the server and see what is happening on STDERR. Second, many people use Dan Bernstein's daemontools package to manage persistent servers, and this requires a process which does not detach from its parent process. If you are using the NetServer server, then it is easy to make it detach:

$s = RPC::Serialized::Server::NetServer->new({
     net_server => { background => 1, setsid => 1 },
});

To stop the server you will then have to issue a kill to the detached process.

There is a handful of alternative servers shipped with this module. For more details, please see the manual pages for each of them:

RPC::Serialized::Server::NetServer

This is a full-blown pre-forking internet server, with many many good features. You will have to install the Net::Server Perl module and dependencies to use this server. It supports TCP and UDP INET sockets, as well as UNIX domain sockets.

RPC::Serialized::Server::STDIO

This is a very simple server which processes one request at a time, accepting data on Standard Input and sending responses to Standard Output.

RPC::Serialized::Server::UCSPI::TCP

If you use Dan Bernstein's tcpserver (a.k.a. ucspi-tcp) then this is the option for you. It is based upon the STDIO server, above, and is designed to be fired up by tcpserver whenever an incoming connection is handled. There is an example script for this in the Perl distribution for this module.

RPC::Serialized::Server::UCSPI::IPC

This is similar to the option above, but for UNIX (i.e. local filesystem) sockets rather than INET sockets. It is designed for ucspi-ipc, produced by SuperScript Technology, Inc., and you can find more details by searching on Google.

How the server works

Each RPC message which comes in "over the wire" is really just a Perl data structure, a hash. There is a "CALL" key, which has the name of the RPC method to invoke, and some "ARGS" to pass to it as arguments.

The server looks at the CALL and tries to load and execute the handler which maps to that CALL. If it fails it raises an Exception, and fires that back to the client. If the invocation is successful, then the RESPONSE is sent back to the client, again in a Perl data structure. It is all quite elegant and simple (i.e. not my design, see "AUTHOR" below for the acknowledgement!).

On the wire, if we switch off most of the Data::Serializer magic and set the Serializer to YAML, then it looks like this:

---
CALL: localtime
ARGS: []
...
--- 
RESPONSE: Sun Jul  8 21:57:28 2007
...

Note that ... is the record separator which tells the server when it can process the incoming request. In this example, the CALL was for a method called localtime, and there were no arguments so I passed an empty list. The RESPONSE was just the scalar output of Perl's localtime function.

You might find it interesting to note that, inside of RPC::Serialized, the methods used to send and receive data at the client and server are identical. In the example above, I entered the YAML document and trailing ... to make the method call, and the server responded with another YAML document and ... record terminator. Using one of the RPC::Serialized::Client family, it would look just the same.
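
If you want to experiment with the message format, the following sketch builds a request as a plain Perl hash and checks its shape. The check_request() helper is hypothetical; its error strings mirror those listed in "DIAGNOSTICS", and the requirement that ARGS be an array reference is an assumption:

```perl
use strict;
use warnings;

# Hypothetical helper: check that a decoded message has the shape the
# server expects. That ARGS must be an array reference is an assumption.
sub check_request {
    my $msg = shift;
    die "Data not a hash reference\n" unless ref $msg eq 'HASH';
    die "Invalid or missing CALL\n"
        unless defined $msg->{CALL} and length $msg->{CALL};
    die "Invalid or missing ARGS\n" unless ref $msg->{ARGS} eq 'ARRAY';
    return 1;
}

# The same request as in the YAML example above
my $request = { CALL => 'localtime', ARGS => [] };
print "request ok\n" if check_request($request);

# A message without a CALL key is rejected
eval { check_request({ ARGS => [] }) };
print "rejected: $@" if $@;
```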

How to write RPC Handlers

First, know that there are three example handlers included in this distribution, so you can just go and look at them if you prefer reading code to documentation! See the modules under RPC::Serialized::Handler::.

So you might have guessed that the first step is to choose your package name for the handler. Each package contains one handler, or put another way, each handler lives in its own package. You can either have the handler's name be derived from the package's name, or set it manually. Using the localtime example from above, this is what the handler looks like:

package RPC::Serialized::Handler::Localtime;

use strict;
use warnings FATAL => 'all';

use base 'RPC::Serialized::Handler';

sub invoke {
    my $self = shift;
    my $time = shift;

    $time = time unless defined $time;
    return scalar localtime($time);
}

1;

There is a "magic" method in the package, called invoke(), and it is this which is called by the RPC server to handle the incoming request.

By default all the servers will take the name of the requested method CALL, convert underscores to :: separators, convert initial letters to uppercase, and try to load a module in the RPC::Serialized::Handler:: namespace. For example, if I called the $c->frobnits_goo handler from a client, the server would try to load a package called RPC::Serialized::Handler::Frobnits::Goo and then call the invoke() method within that.

Alternatively you can see the "OPTIONS FOR THIS MODULE" section below to change that default namespace, or have your own mappings between client calls and loaded handler packages (or a mixture of both).
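
The default mapping from a CALL name to a package name can be sketched as follows. The call_to_package() helper is hypothetical and only illustrates the convention described above; how the server treats unusual names (for example doubled underscores) is not covered here:

```perl
use strict;
use warnings;

# Hypothetical helper: map a CALL name onto a handler package name
# using the default convention (underscores become '::', initial
# letters become uppercase).
sub call_to_package {
    my ($call, $namespace) = @_;
    $namespace = 'RPC::Serialized::Handler' unless defined $namespace;

    # letters, digits and underscores only (see "DIAGNOSTICS")
    die "Cannot search for invalid name: $call\n"
        unless $call =~ /\A[A-Za-z0-9_]+\z/;

    my @parts = map { ucfirst $_ } grep { length $_ } split /_/, $call;
    return join '::', $namespace, @parts;
}

print call_to_package('frobnits_goo'), "\n";
print call_to_package('localtime'), "\n";
```

The second argument stands in for an alternative namespace, such as one you might list in handler_namespaces.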

You should expect all arguments passed to your invoke() method to be in a plain perl list (i.e. @_); items in the list may be references to complex data structures. You should return a single scalar value from that method, and nothing else. The scalar value can, however, be a reference to an arbitrarily complex data structure.
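
As an illustration of that calling convention, here is a sketch of an invoke() method which takes a plain list and returns a single reference. The package name My::Demo::Handler::Tally is hypothetical, and the base class is left out only so that the sketch runs without this distribution installed:

```perl
package My::Demo::Handler::Tally;   # hypothetical package name

use strict;
use warnings;

# In a real handler you would also inherit from the base class:
#   use base 'RPC::Serialized::Handler';

sub invoke {
    my $self  = shift;
    my @words = @_;          # arguments arrive as a plain list

    my %count;
    $count{$_}++ for @words;

    # return one scalar: here, a reference to a hash of counts
    return \%count;
}

package main;

# Called directly here for illustration; the RPC server would
# normally make this call on your behalf.
my $result = My::Demo::Handler::Tally->invoke(qw/ a b a /);
print "$_: $result->{$_}\n" for sort keys %$result;
```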

SETTING UP A CLIENT

As with the server, you don't need to know a lot about how networked services operate in order to set up the client. However you probably do need to know where your server is listening, to make contact with it!

Therefore, you will need to use the client which corresponds to your server. Please read the manual page for the appropriate module:

RPC::Serialized::Client::INET

Use this client package to communicate with either a NetServer server, or the UCSPI/TCP server.

RPC::Serialized::Client::UNIX

This client will make UNIX domain socket (i.e. local filesystem) connections to the NetServer or UCSPI/IPC servers.

RPC::Serialized::Client::STDIO

For testing purposes you can use this client, which communicates on Standard Input and Standard Output. Alternatively, use this package as a base to implement a new kind of client.

The basis of the client set-up is given in "SYNOPSIS", above, but we will show another example here for completeness:

#!/usr/bin/perl

use strict;
use warnings FATAL => 'all';

use Readonly;
use RPC::Serialized::Client::UNIX;

Readonly my $SOCKET => '/var/run/rpc-serialized-example.socket';

my $c = RPC::Serialized::Client::UNIX->new({
    io_socket_unix => { Peer => $SOCKET }
});

eval {
    my $response = $c->echo(qw/ a b c d /);
    print "echo: " . join( ":", @$response ) . "\n";
};
warn "$@\n" if $@;

eval {
    my $now = $c->localtime;
    print "Localtime on the server is: $now\n";
};
warn "$@\n" if $@;

The above code uses the UNIX domain socket client to contact a server which is listening on the file mentioned in $SOCKET. Once the client object is set up, we can make a call to any method we wish on the remote server, just by specifying its name. Here, we call the echo and the localtime handlers.

For each call, you should specify the arguments as a plain perl list, although that list can include references to complex data structures. There is a single (scalar) return value from each call you make, although again this may be a reference to a data structure if you wish.

This example also shows how you should use eval{} constructs around the RPC calls. This is good practice for most network programming, as all kinds of things can go wrong. See the section "ERROR HANDLING" below for more information.

Timeouts

You should be aware that RPC::Serialized operates timeouts on all handler calls. By default, you get 30 seconds to make your request (i.e. make the call and pass any data to the server), another 30 seconds for the server to handle the call, and 30 seconds to transfer the response back to your client.

If any of this fails or times out, an exception will be raised and passed back to you if possible. Exceptions can be passed through RPC, but you don't need to know about how that works, only that you should be prepared to handle die using eval.

To alter the timeout setting, see the next section "OPTIONS FOR THIS MODULE". To see examples of the eval construct see "ERROR HANDLING", below.
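
If you are curious what a per-phase timeout looks like in Perl, the sketch below uses the core alarm() function. This is a generic pattern and not necessarily how RPC::Serialized implements its timeouts; the with_timeout() helper is hypothetical:

```perl
use strict;
use warnings;

# Hypothetical helper: run a code reference, but die if it takes
# longer than $seconds. Uses the core alarm() function (UNIX).
sub with_timeout {
    my ($seconds, $code) = @_;
    my @result;
    eval {
        local $SIG{ALRM} = sub { die "timed out after ${seconds}s\n" };
        alarm $seconds;
        @result = $code->();
        alarm 0;             # cancel the pending alarm on success
        1;
    } or do {
        my $err = $@ || "unknown error\n";
        alarm 0;             # make sure no alarm is left pending
        die $err;
    };
    return wantarray ? @result : $result[0];
}

# A phase which finishes in time
my $value = with_timeout(30, sub { return 'done' });
print "handler returned: $value\n";

# A phase which takes too long
eval { with_timeout(1, sub { sleep 5 }) };
print "caught: $@" if $@;
```

From the client's point of view the effect is the same as any other failure: the call dies, and your eval block catches it.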

OPTIONS FOR THIS MODULE

There is actually only a small number of options for this module, as most of the heavy lifting within is done by other modules on the CPAN. In particular, we use Data::Serializer, and there is a section below which explains how to customize your serializer set-up.

Options are passed to the new() method in a hash reference. To see how this is done, take a look at the "GENERAL CONFIGURATION" or "SYNOPSIS" sections above, although here is a quick example:

my $s = RPC::Serialized::Server::NetServer->new({
    rpc_serialized => { timeout => 15, trace => 1 },
});

handler_namespaces

As explained above in "How to write RPC Handlers", the server will try to auto-load the handler for a call, based on some naming conventions. This value sets the Perl package namespaces which are searched. You can set this either to a scalar string, the name of a single namespace, or an array reference containing a list of such namespaces. The default setting is RPC::Serialized::Handler, into which we supply the echo, localtime, and sleep handlers as examples. Setting this value to an array reference containing an empty list will disable auto-loading.

handlers

An alternative to handler_namespaces, this value allows you to map individual CALLs to a given package name. In this way you can alias or alter the "published" name of handlers, or restrict calls. It can be used in addition to handler_namespaces, but will take priority where both can be used to invoke an RPC handler. Set this value to an anonymous hash, where keys are calls made by the client (e.g. echo) and values are package names containing the handler (e.g. RPC::Serialized::Handler::Echo). No handlers are specified by default.

timeout

This is a scalar value which sets how long RPC::Serialized servers wait before timing out their connections. As explained above, it is used when receiving an RPC call, when dispatching to the handler of that call, and when replying to the client. Each phase is given the timeout in seconds to do its work. The default value is 30 seconds.

trace

This is a boolean (scalar) which sets whether logging of the content of RPC traffic is made using UNIX syslog. For more details see the section "SERVER LOGGING", below. If set to a true value, logging will be enabled. The default value is false.

debug

This is a boolean (scalar) which disables the Data::Serializer magic, meaning just the raw serialized data structures are sent between client and server. Normally, Data::Serializer will perform compression, encryption, ASCII-armoring and hashing of the data it sends, if so configured. If set to a true value, debug prevents this. It can be very useful when combined with the STDIO client and server, to test operations, as you can type CALLs in by hand at the console. The default value is false.

callbacks

A hash reference of key/value pairs, mapping callback names to the corresponding code references. Currently only the pre_handler_argument_filter callback works. It is called after the arguments have been decoded from the RPC call and before your RPC method is called. When the callback is called, its input parameters are:

Hash Reference

This contains just one parameter, server, which is the Net::Server::* object.

List of Original RPC Parameters

This is the normal list of parameters for you to filter.

Returned values are the new RPC parameters. In the callback you can modify, add and/or remove parameters. The call is protected by an eval/throw_app construct so the code can die if needed. For example:

my $c = RPC::Serialized::Client::INET->new({
   ... OTHER OPTIONS ...
   callbacks => {
       pre_handler_argument_filter => sub {
           my $opt = shift;
           #   Net::Server::* object:
           #   $opt->{server}
           #   The normal arguments:
           my @arguments = @_;
           #   Return the reversed list of arguments 
           return reverse @arguments;
       },
   }
});
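
Pulling several of the options above together, a combined set of rpc_serialized options might look like the sketch below. It is only a data structure, so it runs without this distribution installed. The My::App::Handler namespace and the now alias are hypothetical:

```perl
use strict;
use warnings;

# The My::App::Handler namespace and the 'now' alias are hypothetical,
# for illustration only.
my %options = (
    rpc_serialized => {
        # namespaces searched when auto-loading handlers
        handler_namespaces => [ 'RPC::Serialized::Handler', 'My::App::Handler' ],
        # explicit mappings, which take priority over auto-loading
        handlers => {
            echo => 'RPC::Serialized::Handler::Echo',
            now  => 'RPC::Serialized::Handler::Localtime',   # alias
        },
        timeout => 15,   # seconds allowed for each phase
        trace   => 0,    # no syslog trace of RPC traffic
        debug   => 0,    # leave the Data::Serializer features enabled
    },
);

# With the module installed, this would be passed to a server like so:
#   my $s = RPC::Serialized::Server::NetServer->new(\%options);
print join(', ', sort keys %{ $options{rpc_serialized} }), "\n";
```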

CONFIGURING Data::Serializer

The defaults for Data::Serializer, which is used to encode and decode your method calls and responses, are quite sane so you can safely leave this alone.

However you might prefer to override this and use a particular serialization format, or enable encryption, and this is quite straightforward to do. Passing a hash of options within the call to new() at either the client or server will do this, like so:

my $c = RPC::Serialized::Client::STDIO->new({
    data_serializer => { serializer => 'YAML::Syck', encoding => 'b64' },
});

The only option which you cannot alter is portable, and this is forced to true, meaning that Data::Serializer will ASCII-armor the data structure (i.e. encode it in hexadecimal or base64). Of course, if you have enabled the debug option to RPC::Serialized then portable is ignored.

In most cases, the Data::Serializer module at the RPC server will auto-detect the settings used, and reply with a packet with the same settings. There are two cases where this might not work. First, the serializer used on the client must also be installed on the server. Second, any keys and modules used for encryption on the client must be available on the server. With a standard install of RPC::Serialized there should be no concern here, as it uses only core Perl modules, and encryption is not enabled.

For further details please see the Data::Serializer manual page.

CONFIGURING Net::Server

The Net::Server binding shipped with this module has some defaults set, although none are enforced so you can override all options to that module.

The chosen personality is PreFork, and a Single personality is also available. If you want to use something else just copy the bundled binding module (RPC::Serialized::Server::NetServer) and modify as appropriate. Default settings which differ from those in the native Net::Server are as follows:

log_level is set to 4
syslog_facility is set to local1
background is set to undef
setsid is set to undef

This means that logging goes to STDERR from the parent server, but to send it to Syslog instead just do the following (after reading the Net::Server manual page):

my $s = RPC::Serialized::Server::NetServer->new({
    net_server => { log_file => 'Sys::Syslog' },
});

In addition the server does not fork or detach from the shell and go into the background. For further details please see the Net::Server manual page.

AUTHORIZATION

The system from which RPC::Serialized derives supports user-based authorization based on a calling username, the called handler, and the arguments passed to that handler.

In addition, Net::Server supports IP-based access control lists.

Both of these systems are available although by default disabled. Looking in the examples folder with this distribution you should find some sample ACLs for RPC::Serialized. You can also consult the Net::Server manual page for its options.

For the time being the authorization is not documented here, but it is hoped this will be remedied before too long! If you want help with authorization configuration, feel free to email the module author.

SERVER LOGGING

If you have enabled RPC server logging, using the trace option to new(), then output is sent via UNIX Syslog. The server will write out a serialized dump of the data sent and received, using whichever serializer you have set the server to use. This might not be the same serializer used in the transaction, however, as explained in the section "CONFIGURING Data::Serializer", above. You will see the CALL, ARGS, RESPONSE and any EXCEPTIONs raised, in the log.

Logging uses the excellent Log::Dispatch module from CPAN, with its Syslog personality. The default settings are as follows:

name is set to rpc-serialized
min_level is set to info
facility is set to local0
callbacks is set to add a newline to each log message

You can override these settings in the configuration file, or the call to new(), like this:

my $s = RPC::Serialized::Server::STDIO->new({
    log_dispatch_syslog => { facility => 'local7' },
});

Log messages will be dispatched to your syslog subsystem at the level set in min_level. Note that the hash key used is log_dispatch_syslog, as above.

Suppressing Sensitive Data

If you transmit sensitive data in the arguments to handler calls, but also wish to log a trace of the handler call+args, then the args_suppress_log configuration parameter will help.

This parameter takes a hash reference where the keys are the names of handlers and the values are array references of sensitive argument names. Naturally, this assumes that the handler treats its args list as a hash of keys and values, and you can only use this parameter in that situation. For example:

$s = RPC::Serialized::Server::NetServer->new({
    rpc_serialized => { args_suppress_log => {
        login => [qw/ password /],
    }},
});

Using the above configuration, the login handler when called would not log the value of the password named argument in its args. The text [suppressed] is output to the log in place of the named argument's value.
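
The effect on the logged arguments can be pictured with the following sketch. The suppress_args() helper is hypothetical and only imitates the behaviour described above; the server's own logging code is not shown in this manual page:

```perl
use strict;
use warnings;

# Hypothetical helper: return a copy of the argument list that is
# safe to log, with the values of sensitive keys masked.
sub suppress_args {
    my ($call, $args, $suppress) = @_;
    my %sensitive = map { $_ => 1 } @{ $suppress->{$call} || [] };
    my @copy = @$args;
    # walk the list as key/value pairs
    for (my $i = 0; $i < $#copy; $i += 2) {
        $copy[$i + 1] = '[suppressed]' if $sensitive{ $copy[$i] };
    }
    return \@copy;
}

# Same shape as the args_suppress_log setting above
my $suppress = { login => [qw/ password /] };
my $logged   = suppress_args(
    'login', [ username => 'alice', password => 's3cret' ], $suppress,
);
print join(' ', @$logged), "\n";
```

The original argument list is left untouched, so the handler itself still receives the real password.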

ERROR HANDLING

This module makes use of Exception::Class when it needs to raise a critical error, but don't fret if this makes no sense to you. The essential concept is that calls to this module might die, and you need to be able to deal with that.

The usual way is to wrap calls in an eval{} block to trap errors, like so:

eval {
    my $now = $c->localtime;
    print "Localtime on the server is: $now\n";
};
warn "Remote procedure call failed: $@\n" if $@;

A nifty part of this module (courtesy of the original authors of the code) is that an exception can be raised in the server and delivered to the client. The exceptions are RPC::Serialized::X objects, derived from Exception::Class, of the following types:

RPC::Serialized::X::Protocol is for an RPC protocol error
RPC::Serialized::X::Parse is for a Data::Serializer parser failure
RPC::Serialized::X::Validation is for a data validation error
RPC::Serialized::X::System is for any system error
RPC::Serialized::X::Application is for application programming errors
RPC::Serialized::X::Authorization is for an authorization failure

Typically you want to check if it was RPC::Serialized having a problem, or some other issue:

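use Scalar::Util 'blessed';

eval {
    my $num = $c->cabbages;
    print "Number of cabbages is: $num\n";
};
if (my $err = $@) {
    # only call methods on $err if it really is an exception object
    if (blessed $err and $err->isa('RPC::Serialized::X')) {
        print $err->message, "\n"; # "no handler for cabbages"
    }
    else { die $err } # rethrow the exception
}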
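
Since $@ may hold either an exception object or a plain string, it can be handy to normalize it before logging. The error_text() helper and the My::Demo::X class below are hypothetical. The class is a stand-in used only so that the sketch runs without RPC::Serialized installed:

```perl
use strict;
use warnings;
use Scalar::Util 'blessed';

# Hypothetical helper: turn whatever was left in $@ into one line of text.
sub error_text {
    my $err = shift;
    return '' unless defined $err and length $err;
    if (blessed $err and $err->can('message')) {
        # exception objects: report the class and its message
        return ref($err) . ': ' . $err->message;
    }
    chomp(my $text = "$err");   # plain die() strings
    return $text;
}

# A stand-in exception class, for illustration only
{
    package My::Demo::X;
    sub new     { my ($class, %args) = @_; return bless {%args}, $class }
    sub message { return $_[0]{message} }
}

eval { die My::Demo::X->new(message => 'no handler for cabbages') };
print error_text($@), "\n";

eval { die "plain failure\n" };
print error_text($@), "\n";
```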

For further details please see the Exception::Class manual page.

DIAGNOSTICS

Here is a list of the common error messages and exception types raised by this module, and probable causes:

Failed to create socket: ... in an X::System

The INET or UNIX client has failed to set up an IO::Socket::INET or IO::Socket::UNIX socket respectively.

Invalid or missing CALL in an X::Protocol

After de-serializing the incoming data message from the client, there appears to be no CALL parameter.

Invalid or missing ARGS in an X::Protocol

After de-serializing the incoming data message from the client, there appears to be no ARGS list.

Failed to load ... in an X::System

The server has attempted to load the handler specified for the current call, but failed. Did you specify the correct handler?

No handler for ... in an X::Application

After searching any manual handler mappings, or the auto-load namespaces, no suitable handler package was found for the current call.

... not a RPC::Serialized::Handler in an X::Application

Having found a package to load for the current call from the handler specification, it does not inherit from RPC::Serialized::Handler.

Cannot search for invalid name: ... in an X::Application

You are attempting to auto-load a handler whose package name would not be valid in perl. It must be letters, digits and underscores only.

Invalid or missing CLASS in an X::Protocol

An Exception class raised by the server is not known to the client, so this exception is raised instead.

Object method called on class in an X::Application

You are attempting to invoke a call on the client module directly, rather than instantiating a new client object from it and then making the call on that.

Missing or invalid input handle in an X::Application

The server has not been passed a valid IO::Handle upon which to read data. The handle is passed in the call to new() or via the ifh accessor method on the server object.

Missing or invalid output handle in an X::Application

The server has not been passed a valid IO::Handle upon which to write data. The handle is passed in the call to new() or via the ofh accessor method on the server object.

Failed to load Log::Dispatch but trace is on: ... in an X::Application

You have enabled server logging using the trace option, but the Log::Dispatch or Log::Dispatch::Syslog module has failed to load.

Data not a hash reference in an X::Protocol

After de-serializing some data (from the client or server), the data structure appears not to be a hash reference.

Failed to send data: ... in an X::System

A system error has occurred when sending data through the handle to the client or server.

Failed to read data: ... in an X::System

A system error has occurred when reading data from the handle to the client or server.

Data::Serializer error: ... in an X::Protocol

An error has been thrown by the Data::Serializer module when initializing.

Serializer parse error in an X::Protocol

An error has been thrown by the Data::Serializer module when attempting to serialize or de-serialize data to or from the client or server.

DEPENDENCIES

In addition to the contents of the standard Perl 5.8.4 distribution, this module requires the following:

Data::Serializer
Exception::Class
Module::MultiConf
Readonly
Class::Accessor::Fast::Contained

To use some optional features, you may require the following:

Net::Server
Log::Dispatch
GDBM_File

THANKS

This module is a derivative of YAML::RPC, written by pod and Ray Miller, at the University of Oxford Computing Services. Without their brilliant creation this system would not exist.

AUTHOR

Oliver Gorwits <oliver@cpan.org>

COPYRIGHT AND LICENSE

This software is copyright (c) 2012 by University of Oxford.

This is free software; you can redistribute it and/or modify it under the same terms as the Perl 5 programming language system itself.