
NAME

Argon

RATIONALE

Argon is a distributed processing platform for Perl. It provides a simple system for building radically scalable software while ensuring a high level of robustness and redundancy.

MANAGERS

Managers are entry points into the distributed network. They accept tasks from clients and route them to workers, then deliver the results back to the client.

Managers keep track of the nodes available on the network, ensuring that work is distributed in a balanced and efficient manner to achieve the highest throughput. If a worker becomes unavailable, the load is automatically shifted to the rest of the network. If the worker becomes available again, it will be picked up and the manager will start shifting load to it as necessary.
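
As a rough illustration of the balancing described above, the following sketch routes each task to the available worker with the fewest tasks in flight. It is not Argon's actual routing code: the %in_flight and %available tables, the worker names, and the pick_worker and assign_task helpers are all hypothetical, and the real manager may use a different strategy.

```perl
use strict;
use warnings;

# Hypothetical bookkeeping (not Argon's own data structures):
# worker name => number of tasks currently in flight.
my %in_flight = (worker_a => 0, worker_b => 0, worker_c => 0);
my %available = map { $_ => 1 } keys %in_flight;

# Pick the available worker with the fewest tasks in flight.
# Ties are broken by name so the result is deterministic.
sub pick_worker {
    my @candidates =
        sort { $in_flight{$a} <=> $in_flight{$b} or $a cmp $b }
        grep { $available{$_} } keys %in_flight;
    return $candidates[0];
}

# Record the assignment and return the chosen worker's name.
sub assign_task {
    my $worker = pick_worker();
    return unless defined $worker;
    $in_flight{$worker}++;
    return $worker;
}

my @first = map { assign_task() } 1 .. 3;    # one task per worker
$available{worker_b} = 0;                    # worker_b drops off the network
my @second = map { assign_task() } 1 .. 2;   # load shifts to the others

print "first round:  @first\n";
print "second round: @second\n";
```

Marking worker_b as available again would make it the least-loaded candidate, so new tasks would start flowing back to it.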

Managers are started with argon:

    argon manager --port 8000 --host mgrhost

See argon.

WORKERS

Workers are essentially a managed pool of Perl processes. Managers route tasks to workers, which distribute them among their pool of Perl processes, then return the results to the manager (which in turn ensures they get back to the client).

Once started, the worker notifies the manager that it is available and can immediately start handling tasks as needed. If for any reason the worker loses its connection to the manager, it will attempt to reestablish the connection until it is again in contact with its manager.

Argon workers are uniform. There are no "channels" for individual types of tasks. All workers can handle any type of task. This ensures that no classes of task are starved of resources while other types have underutilized workers.

Workers are started with argon:

    argon worker --port 8001 --host workerhost --manager somehost:8000

By default, a worker will start a number of Perl processes that correlates to the number of CPUs on the system. This can be overridden with the --workers option.

    argon worker --port 8001 --host workerhost --manager somehost:8000 --workers 8

See argon.

CLIENTS

Clients connect to the manager (or, if desired, directly to a "stand-alone" worker that was started without the --manager option). Tasks can be launched in two different ways.

The first method is to send a task and wait for the results. Note that Argon uses Coro, so "waiting" for the result means that the current thread of execution yields until the result is ready, at which point it is awoken.

    use Argon::Client;

    my $client = Argon::Client->new(host => "mgrhost", port => 8000);
    my $result = $client->process(
        # Code to execute
        sub {
            my ($x, $y) = @_;
            return $x + $y;
        },
        # Arguments to pass that code
        [4, 7],
    );

Tasks can also be sent off to the network in the background, allowing the thread of execution to continue until a point where synchronization is required.

    use Argon::Client;

    my $client = Argon::Client->new(host => "mgrhost", port => 8000);

    # Ship the task off and get a function that, when called, waits for
    # the result and returns it.
    my $deferred = $client->defer(
        # Code to execute
        sub {
            my ($x, $y) = @_;
            return $x + $y;
        },
        # Arguments to pass that code
        [4, 7],
    );

    # Synchronize to get the result
    my $result = $deferred->();

Errors thrown during the execution of a task are trapped and re-thrown by the client when the result is returned. With process, this happens when the call returns. With defer, it happens when the deferred result is synchronized.
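
The trap-and-re-throw behavior can be pictured with plain Perl, without an Argon network. In this minimal sketch the make_deferred helper is hypothetical and runs the task locally and immediately. It only illustrates how an error can be held back until the deferred result is synchronized, and is not how Argon::Client is implemented.

```perl
use strict;
use warnings;

# Hypothetical stand-in for a deferred result. The task runs right away,
# any error is trapped, and it is re-thrown only when the returned
# function is called.
sub make_deferred {
    my ($code, $args) = @_;
    my ($result, $error);
    my $ok = eval { $result = $code->(@$args); 1 };
    $error = $@ unless $ok;

    return sub {
        die $error unless $ok;
        return $result;
    };
}

my $good = make_deferred(sub { $_[0] + $_[1] }, [4, 7]);
my $bad  = make_deferred(sub { die "task failed\n" }, []);

print $good->(), "\n";    # prints 11

# Nothing has been thrown so far; the error surfaces on synchronization.
my $value = eval { $bad->() };
print "caught: $@" if $@;
```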

See Argon::Client.

THE EASY WAY

In most cases there will be only a single manager to which a client will wish to connect. Argon::Simple is available for this case:

    use Argon::Simple;

    connect 'mgrhost:8000';

    # Send off some numbers to crunch
    my $result = process { shift * shift } 4, 7;

    # Use the results
    printf "4 * 7 = %d\n", $result->();

process returns a deferred value. Calling it synchronizes the thread, returning the result once the task has been completed by the system.

When called in list context, process returns two values: the deferred value as well as a test function that returns true once the task is finished running:

    # Send off some numbers to crunch
    my ($result, $is_finished) = process { shift * shift } 4, 7;

    # Do other things until the work is complete
    do { ... } until $is_finished->();

    # Use the results
    printf "4 * 7 = %d\n", $result->();

SCALABILITY

Argon is designed to make scaling simple: add more workers to boost the resources available to all applications using the network.

Because Argon workers are all uniform, adding a new worker node guarantees a linear boost in resources available to all client applications. For example, given identical tasks on a network with two worker nodes, each running the same number of processes, adding another worker would increase throughput by 50%. Doubling the number of workers would increase throughput by 100%.
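
The arithmetic in this example can be checked with a few lines of Perl. The throughput_gain_pct helper is hypothetical and assumes the ideal case described above: identical tasks, identical workers, and throughput proportional to the number of workers.

```perl
use strict;
use warnings;

# Percentage gain in throughput when going from $before to $after workers,
# assuming throughput scales linearly with the number of workers.
sub throughput_gain_pct {
    my ($before, $after) = @_;
    return 100 * ($after - $before) / $before;
}

printf "%d -> %d workers: +%.0f%%\n", 2, 3, throughput_gain_pct(2, 3);
printf "%d -> %d workers: +%.0f%%\n", 2, 4, throughput_gain_pct(2, 4);
```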

HYPERCONGESTION AND PREDICTABLE PERFORMANCE DEGRADATION

Argon managers started with the queue-size switch are protected by a size-limited queue, preventing large backlogs when the system is under high load. When the queue limit is reached, tasks are rejected. The client (Argon::Client) transparently retries rejected tasks with an increasing delay until the system is able to accommodate them (up to a configurable maximum, which defaults to 10 retries).
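
The retry behavior might look roughly like the sketch below. The retry_with_backoff helper and its option names are hypothetical and not part of Argon::Client. Doubling the delay after each rejection is also an assumption, since the text above says only that the delay increases.

```perl
use strict;
use warnings;

# Hypothetical helper: call $attempt until the task is accepted or
# $max_retries retries have been used, doubling the delay each time.
# $attempt must return (accepted_flag, result).
sub retry_with_backoff {
    my (%opt) = @_;
    my $attempt     = $opt{attempt};
    my $max_retries = $opt{max_retries}   // 10;
    my $delay       = $opt{initial_delay} // 0.1;
    my $sleep       = $opt{sleep}
        // sub { select(undef, undef, undef, $_[0]) };

    for my $try (0 .. $max_retries) {
        my ($accepted, $result) = $attempt->();
        return $result if $accepted;
        last if $try == $max_retries;
        $sleep->($delay);
        $delay *= 2;
    }

    die "task rejected after $max_retries retries\n";
}

# Simulated manager that rejects the first three attempts.
my $calls = 0;
my @delays;
my $result = retry_with_backoff(
    attempt       => sub { ++$calls <= 3 ? (0) : (1, 11) },
    initial_delay => 0.1,
    sleep         => sub { push @delays, $_[0] },  # record, do not sleep
);

print "result $result after $calls attempts; delays: @delays\n";
```

Passing the sleep function in as an option keeps the example fast to run and makes the sequence of delays easy to inspect.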

When hit with a spike in traffic, a service can acquire a growing backlog of requests that are beyond its capacity to handle (hypercongestion). The system cannot return to responsiveness until the backlog has been cleared to within the limits of the system's capacity.

Setting a queue limit enforces a defined, maximum load for the system, ensuring that the system remains responsive even when under high load, and protecting other back end services from being flooded with large spikes in requests (e.g. a database server).
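
A size-limited queue of this kind can be sketched in a few lines. The make_bounded_queue helper below is hypothetical and only illustrates the accept-or-reject decision. It says nothing about how the manager's queue is actually implemented.

```perl
use strict;
use warnings;

# Hypothetical size-limited queue: accepts tasks until $limit is reached,
# then rejects new ones instead of letting a backlog build up.
sub make_bounded_queue {
    my ($limit) = @_;
    my @queue;
    return {
        put => sub {
            return 0 if @queue >= $limit;    # rejected: queue is full
            push @queue, $_[0];
            return 1;                        # accepted
        },
        take => sub { shift @queue },
        size => sub { scalar @queue },
    };
}

my $q = make_bounded_queue(2);

my @accepted = map { $q->{put}->("task $_") } 1 .. 3;   # (1, 1, 0)
$q->{take}->();                       # a worker picks up a task
my $retry = $q->{put}->("task 3");    # there is room again, so this succeeds

print "accepted: @accepted; retry accepted: $retry\n";
```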

The benefit of this strategy is that administrators can predict the degradation to performance in the event of higher than normal traffic and account for it correctly when deciding how to allocate resources.

AUTHOR

Jeff Ober <jeffober@gmail.com>