NAME
GRID::Machine::Group - Parallel Computing via SSH
SYNOPSIS
use GRID::Machine;
use GRID::Machine::Group;
use List::Util qw(sum);
my @MACHINE_NAMES = split /\s+/, $ENV{MACHINES};
my @m = map { GRID::Machine->new(host => $_, wait => 5, survive => 1) } @MACHINE_NAMES;
my $c = GRID::Machine::Group->new(cluster => [ @m ]);
$c->sub(suma_areas => q{
my ($id, $N, $np) = @_;
my $sum = 0;
for (my $i = $id; $i < $N; $i += $np) {
my $x = ($i + 0.5) / $N;
$sum += 4 / (1 + $x * $x);
}
$sum /= $N;
});
my ($N, $np, $pi) = (1000, 4, 0);
my @args = map { [$_, $N, $np] } 0..$np-1;
print sum($c->suma_areas(args => \@args)->Results)."\n";
DESCRIPTION
This module provides parallel Remote Procedure Calls (RPC) to a cluster of machines over SSH connections. The result of a remote call is a GRID::Machine::Group::Result object.
The Constructor new
The simplest call looks like:
my $machine = GRID::Machine::Group->new(cluster => [ array of GRID::Machine objects]);
The constructor returns a new object blessed into a unique class that inherits from GRID::Machine::Group. That is, the new object is the only instance of its class (a singleton). When new methods are later added to the cluster object, they are installed in that singleton class.
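The per-object class idea can be sketched in plain Perl. This is a minimal illustration only, not the module's implementation: the package Demo::Group, the Singleton naming scheme and the add_method helper are all hypothetical names chosen for the example.

```perl
use strict;
use warnings;

package Demo::Group;    # hypothetical stand-in, not GRID::Machine::Group

my $counter = 0;

sub new {
    my ($class, %args) = @_;
    # Create a class name used by this object only, inheriting from $class.
    my $unique = $class . '::Singleton' . ++$counter;
    { no strict 'refs'; @{"${unique}::ISA"} = ($class); }
    my $self = { %args };
    return bless $self, $unique;
}

# Install $code as method $name in the object's own class only.
sub add_method {
    my ($self, $name, $code) = @_;
    no strict 'refs';
    *{ ref($self) . "::$name" } = $code;
    return $self;
}

package main;

my $g1 = Demo::Group->new;
my $g2 = Demo::Group->new;
$g1->add_method(hello => sub { 'hello from ' . ref(shift) });

print $g1->hello, "\n";
print 'g2 can hello? ', ($g2->can('hello') ? 'yes' : 'no'), "\n";
```

Because each object has its own class, a method added to one group does not leak into another group built from the same base class.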
The sub method is used to install a parallel method in the cluster object. The method is installed on all the machines belonging to the cluster:
$c->sub(suma_areas => q{
my ($id, $N, $np) = @_;
my $sum = 0;
for (my $i = $id; $i < $N; $i += $np) {
my $x = ($i + 0.5) / $N;
$sum += 4 / (1 + $x * $x);
}
$sum /= $N;
});
Once a method like suma_areas is installed, it can be called as a method of the GRID::Machine::Group object. Consider the following variant of the SYNOPSIS example:
$ cat -n pi8.pl
1 #!/usr/bin/perl -w
2 use strict;
3 use GRID::Machine;
4 use GRID::Machine::Group;
5 use Data::Dumper;
6
7 my @MACHINE_NAMES = split /\s+/, $ENV{MACHINES};
8 my @m = map { GRID::Machine->new(host => $_, wait => 5, survive => 1) } @MACHINE_NAMES;
9
10 my $c = GRID::Machine::Group->new(cluster => [ @m ]);
11
12 $c->sub(suma_areas => q{
13 my ($id, $N, $np) = @_;
14
15 my $sum = 0;
16 for (my $i = $id; $i < $N; $i += $np) {
17 my $x = ($i + 0.5) / $N;
18 $sum += 4 / (1 + $x * $x);
19 }
20 $sum /= $N;
21 });
22
23 my ($N, $np, $pi) = (1000, 4, 0);
24
25 print Dumper($c->suma_areas(args => [ map { [$_, $N, $np] } 0..$np-1 ]));
The call
$c->suma_areas(args => [ map { [$_, $N, $np] } 0..$np-1 ])
is executed in parallel using the master-worker paradigm. As soon as a worker machine is free it executes, via SSH, a call to the sub suma_areas with one of the argument lists specified in the args array. Here is an example run of the program above with two machines
~/grid-machine/examples$ echo $MACHINES
machine1 machine2
and four arguments (i.e. four tasks, see the definition of $np at line 23 above):
~/grid-machine/examples$ perl pi8.pl
$VAR1 = bless( [
bless( {
'stderr' => '',
'errmsg' => '',
'type' => 'RETURNED',
'stdout' => '',
'errcode' => 0,
'results' => [ '0.786147935324536' ]
}, 'GRID::Machine::Result' ),
bless( {
'stderr' => '',
'errmsg' => '',
'type' => 'RETURNED',
'stdout' => '',
'errcode' => 0,
'results' => [ '0.785648435012036' ]
}, 'GRID::Machine::Result' ),
bless( {
'stderr' => '',
'errmsg' => '',
'type' => 'RETURNED',
'stdout' => '',
'errcode' => 0,
'results' => [ '0.785148433449528' ]
}, 'GRID::Machine::Result' ),
bless( {
'stderr' => '',
'errmsg' => '',
'type' => 'RETURNED',
'stdout' => '',
'errcode' => 0,
'results' => [ '0.784647933137027' ]
}, 'GRID::Machine::Result' )
], 'GRID::Machine::Group::Result' );
The GRID::Machine::Group::Result object is an array of GRID::Machine::Result objects. The results appear in the same order as the arguments of the call. That is, the first result corresponds to the first argument, the second to the second and so on.
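To see what the four results mean, the same computation can be run in a single local process. The sketch below does not use GRID::Machine at all: it copies the body of suma_areas into an ordinary sub and evaluates it once per argument list, keeping the partial sums in argument order as the Results method is described to do.

```perl
use strict;
use warnings;
use List::Util qw(sum);

# Same body as the remote suma_areas, run in the local process.
sub suma_areas {
    my ($id, $N, $np) = @_;
    my $sum = 0;
    for (my $i = $id; $i < $N; $i += $np) {
        my $x = ($i + 0.5) / $N;
        $sum += 4 / (1 + $x * $x);
    }
    return $sum / $N;
}

my ($N, $np) = (1000, 4);
my @args = map { [$_, $N, $np] } 0 .. $np - 1;

# One partial sum per argument list, kept in argument order.
my @partial = map { suma_areas(@$_) } @args;

printf "task %d -> %.15f\n", $_, $partial[$_] for 0 .. $#partial;
printf "sum    -> %.14f\n", sum(@partial);
```

Each partial sum is close to Pi/4, and adding the four of them gives the approximation to Pi.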
The cluster argument of the constructor is a reference to an array of either GRID::Machine objects or strings. In the second case, the strings specify the host names. See the following example:
~/grid-machine/examples$ echo $MACHINES
machine1 machine2
$ cat -n pi2.pl
1 #!/usr/bin/perl -w
2 use strict;
3 use GRID::Machine;
4 use GRID::Machine::Group;
5 use List::Util qw(sum);
6
7 my @MACHINE_NAMES = split /\s+/, $ENV{MACHINES};
8 my $c = GRID::Machine::Group->new(cluster => [ @MACHINE_NAMES ]);
9
10 $c->sub(suma_areas => q{
11 my ($id, $N, $np) = @_;
12
13 my $sum = 0;
14 for (my $i = $id; $i < $N; $i += $np) {
15 my $x = ($i + 0.5) / $N;
16 $sum += 4 / (1 + $x * $x);
17 }
18 $sum /= $N;
19 });
20
21 my ($N, $np, $pi) = (1000, 4, 0);
22
23 my @args = map { [$_, $N, $np] } 0..$np-1;
24
25 print sum($c->suma_areas( args => \@args )->Results)."\n";
The Mapping of Tasks to Machines
When executing a number of tasks on a GRID::Machine::Group with p machines, the following properties hold:
The first p tasks are always mapped in machine order: the first task is executed on machine 0, the second on machine 1, and so on up to task p-1, which is executed on machine p-1.
No mapping is guaranteed for the remaining tasks: each task after the first p is allocated to the first machine that becomes idle.
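The mapping rule can be illustrated with a small simulation. This is a sketch under stated assumptions, not the scheduler used by the module: simulate_mapping is a hypothetical helper, task durations are assumed to be known in advance, and ties between idle machines are broken in favour of the lowest machine index.

```perl
use strict;
use warnings;

# Hypothetical helper: given the number of machines $p and the duration
# of each task, return the machine index each task would be mapped to.
sub simulate_mapping {
    my ($p, @duration) = @_;
    my @free_at = (0) x $p;    # time at which each machine becomes idle
    my @machine_of;
    for my $task (0 .. $#duration) {
        my $m;
        if ($task < $p) {
            $m = $task;        # the first p tasks go in machine order
        }
        else {
            # later tasks go to the machine that becomes idle first
            $m = 0;
            for my $k (1 .. $p - 1) {
                $m = $k if $free_at[$k] < $free_at[$m];
            }
        }
        $free_at[$m] += $duration[$task];
        push @machine_of, $m;
    }
    return @machine_of;
}

# Two machines, three tasks lasting 2, 1 and 4 seconds.
my @mapping = simulate_mapping(2, 2, 1, 4);
print "task $_ -> machine $mapping[$_]\n" for 0 .. $#mapping;
```

With these durations machine 1 finishes its first task before machine 0 does, so the third task is sent to machine 1.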
The sub method
The sub method installs the specified method on each of the machines in the cluster. Once the method is installed via sub it can be called as a method of the GRID::Machine::Group object:
$ cat -n smallpar5.pl
1 #!/usr/bin/perl -w
2 use strict;
3 use Data::Dumper;
4
5 use GRID::Machine;
6 use GRID::Machine::Group;
7
8 my @MACHINE_NAMES = split /\s+/, $ENV{MACHINES};
9 my @m = map { GRID::Machine->new(host => $_) } @MACHINE_NAMES;
10
11 my $group = GRID::Machine::Group->new(cluster => \@m);
12
13 my @r = $group->sub(do_something => q{
14 my $a = shift;
15 return { sq => $a*$a }
16 });
17
18 print "Parallel:\n";
19 my $r = $group->do_something( replicate => sub { 1+$_->logic_id } );
20
21 print Dumper($r);
22
23 print "Sequentially:\n";
24 for (@m) {
25 my $r = $_->do_something( 1+$_->logic_id );
26 print Dumper($r);
27 }
The replicate key used in the call at line 19 calls the sub { 1+$_->logic_id } once for each machine in the GRID::Machine::Group object. When run on a GRID::Machine::Group object with two machines, the example above produces output like this:
MacBookdeCasiano:examples casiano$ perl smallpar5.pl
Parallel:
$VAR1 = bless( [
bless( {
'stderr' => '',
'errmsg' => '',
'type' => 'RETURNED',
'stdout' => '',
'errcode' => 0,
'results' => [
{
'sq' => 1
}
]
}, 'GRID::Machine::Result' ),
bless( {
'stderr' => '',
'errmsg' => '',
'type' => 'RETURNED',
'stdout' => '',
'errcode' => 0,
'results' => [
{
'sq' => 4
}
]
}, 'GRID::Machine::Result' )
], 'GRID::Machine::Group::Result' );
Sequentially:
$VAR1 = bless( {
'stderr' => '',
'errmsg' => '',
'type' => 'RETURNED',
'stdout' => '',
'errcode' => 0,
'results' => [
{
'sq' => 1
}
]
}, 'GRID::Machine::Result' );
$VAR1 = bless( {
'stderr' => '',
'errmsg' => '',
'type' => 'RETURNED',
'stdout' => '',
'errcode' => 0,
'results' => [
{
'sq' => 4
}
]
}, 'GRID::Machine::Result' );
The makemethod method
The makemethod method installs an already existing remote subroutine as a method of the GRID::Machine::Group object. In the following example, a C function named sigma (lines 9-18) is installed on the remote machines via Inline::C (line 26). Proxy methods on the local side are installed both for the individual machines and for the cluster in line 33:
$ cat -n pi6.pl
1 #!/usr/bin/perl -w
2 use strict;
3 use GRID::Machine;
4 use GRID::Machine::Group;
5 use List::Util qw(sum);
6
7 my @MACHINE_NAMES = split /\s+/, $ENV{MACHINES};
8 my $code = << 'EOFUNCTION';
9 double sigma(int id, int N, int np) {
10 double sum = 0;
11 int i;
12 for(i = id; i < N; i += np) {
13 double x = (i + 0.5) / N;
14 sum += 4 / (1 + x * x);
15 }
16 sum /= N;
17 return sum;
18 }
19 EOFUNCTION
20 ;
21
22 my @m = map {
23 GRID::Machine->new(
24 host => $_,
25 wait => 5,
26 uses => [ qq{Inline 'C' => q{$code}} ],
27 survive => 1,
28 )
29 } @MACHINE_NAMES;
30
31 my $c = GRID::Machine::Group->new(cluster => [ @m ]);
32
33 $c->makemethod('sigma');
34
35 my ($N, $np, $pi) = (1000, 4, 0);
36
37 my @args = map { [$_, $N, $np] } 0..$np-1;
38
39 print sum($c->sigma(args => \@args)->Results)."\n";
The call method
The call method (see line 23 in the example below) calls a subroutine previously installed via the sub method (see line 16 below) on an arbitrary number of arguments. As soon as a machine finishes, a new task/argument is sent to it.
$ cat -n watchgridmachines2.pl
1 #!/usr/bin/perl -w
2 use strict;
3 use Data::Dumper;
4 use IO::Select;
5
6 use GRID::Machine;
7 use GRID::Machine::Group;
8
9 my @secs = (2, 1, 4, 3, 7, 5, 1, 8..100);
10
11 my @MACHINE_NAMES = split /\s+/, $ENV{MACHINES};
12 my @m = map { GRID::Machine->new(host => $_) } @MACHINE_NAMES;
13
14 my $group = GRID::Machine::Group->new(cluster => \@m );
15
16 my @r = $group->sub(do_something => q{
17 my $arg = shift;
18 sleep $arg;
19 SERVER->host().":$arg"
20 });
21
22 my @args = map { [ $secs[$_] ] } 0..2*$#MACHINE_NAMES;
23 my $r = $group->call(do_something => args => \@args);
24
25 print Dumper($r);
The execution of this code produces an output like this:
$ perl watchgridmachines2.pl
$VAR1 = bless( [
bless( {
'stderr' => '',
'errmsg' => 'Operation timed out',
'type' => 'RETURNED',
'stdout' => '',
'errcode' => 0,
'results' => [ 'local:2' ]
}, 'GRID::Machine::Result' ),
bless( {
'stderr' => '',
'errmsg' => 'Operation timed out',
'type' => 'RETURNED',
'stdout' => '',
'errcode' => 0,
'results' => [ 'imac:1' ]
}, 'GRID::Machine::Result' ),
bless( {
'stderr' => '',
'errmsg' => 'Operation timed out',
'type' => 'RETURNED',
'stdout' => '',
'errcode' => 0,
'results' => [ 'imac:4' ]
}, 'GRID::Machine::Result' )
], 'GRID::Machine::Group::Result' );
The eval method
The eval method (see line 37 in the example below) evaluates some code on the machines belonging to the cluster object. In this example, Inline::C is used on the remote machines to make the computation more efficient:
$ cat -n pi7.pl
1 #!/usr/bin/perl -w
2 use strict;
3 use GRID::Machine;
4 use GRID::Machine::Group;
5 use List::Util qw(sum);
6
7 my @MACHINE_NAMES = split /\s+/, $ENV{MACHINES};
8 my $code = << 'EOFUNCTION';
9 double sigma(int id, int N, int np) {
10 double sum = 0;
11 int i;
12 for(i = id; i < N; i += np) {
13 double x = (i + 0.5) / N;
14 sum += 4 / (1 + x * x);
15 }
16 sum /= N;
17 return sum;
18 }
19 EOFUNCTION
20 ;
21
22 my @m = map {
23 GRID::Machine->new(
24 host => $_,
25 wait => 5,
26 uses => [ qq{Inline 'C' => q{$code}} ],
27 survive => 1,
28 )
29 } @MACHINE_NAMES;
30
31 my $c = GRID::Machine::Group->new(cluster => [ @m ]);
32
33 my ($N, $np, $pi) = (1000, 4, 0);
34
35 my @args = map { [$_, $N, $np] } 0..$np-1;
36
37 my @r = $c->eval(q{ sigma(@_) }, args => \@args)->Results;
38
39 print sum(@r)."\n";
Specifying the Arguments Through an Iterator
Instead of passing an array of arguments as the value of the args key, the arguments can be specified by an iterator. See the call to suma_areas in line 30:
~/grid-machine/examples$ cat -n pi3.pl
1 #!/usr/bin/perl -w
2 use strict;
3 use GRID::Machine;
4 use GRID::Machine::Group;
5 use List::Util qw(sum);
6
7 my @MACHINE_NAMES = split /\s+/, $ENV{MACHINES};
8 my $c = GRID::Machine::Group->new(cluster => [ @MACHINE_NAMES ]);
9
10 $c->sub(suma_areas => q{
11 my ($id, $N, $np) = @_;
12
13 my $sum = 0;
14 for (my $i = $id; $i < $N; $i += $np) {
15 my $x = ($i + 0.5) / $N;
16 $sum += 4 / (1 + $x * $x);
17 }
18 $sum /= $N;
19 });
20
21 my ($N, $np, $pi) = (1000, 4, 0);
22
23 {
24 my $count = 0;
25 sub next { [$count++, $N, $np] };
26 sub thereare { $count < $np };
27 sub reset { $count = 0 };
28 }
29
30 print sum($c->suma_areas( args => { next => \&next, thereareargs => \&thereare, reset => \&reset } )->Results)."\n";
An iterator is specified by passing a hashref with keys next, thereareargs and reset:
next returns the next item
thereareargs returns true if the iterator has not finished
reset initializes the iterator
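The three callbacks can also be built as closures over a private counter, which avoids defining named subs called next and reset in the main package. The sketch below is plain Perl and only shows the shape of the hashref: make_args_iterator is a hypothetical helper, and the loop that drains the iterator is an assumption about how a consumer might use the three keys, not the module's code.

```perl
use strict;
use warnings;

# Hypothetical helper: build an iterator hashref that yields the
# argument lists [0, $N, $np], [1, $N, $np], ..., [$np-1, $N, $np].
sub make_args_iterator {
    my ($N, $np) = @_;
    my $count = 0;    # private state shared by the three closures
    return {
        next         => sub { [ $count++, $N, $np ] },
        thereareargs => sub { $count < $np },
        reset        => sub { $count = 0 },
    };
}

my $it = make_args_iterator(1000, 4);

# Drain the iterator the way a consumer could: reset, then loop.
my @seen;
$it->{reset}->();
while ($it->{thereareargs}->()) {
    push @seen, $it->{next}->();
}
print join(' ', map { "[@$_]" } @seen), "\n";
```

A hashref built this way has the same keys as the one passed to suma_areas in pi3.pl.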
When executed, the previous program produces an output like this:
~/grid-machine/examples$ perl pi3.pl
3.14159273692313
The replicate argument
We can use the key replicate instead of args to specify the arguments. The single argument is replicated as many times as there are machines in the GRID::Machine::Group object. The value can be an ARRAY ref or a CODE ref:
$ cat -n smallpar2.pl
1 #!/usr/bin/perl -w
2 use strict;
3 use Data::Dumper;
4
5 use GRID::Machine;
6 use GRID::Machine::Group;
7
8 my @MACHINE_NAMES = split /\s+/, $ENV{MACHINES};
9 my @m = map { GRID::Machine->new(host => $_) } @MACHINE_NAMES;
10
11 my $group = GRID::Machine::Group->new(cluster => \@m);
12
13 my @r = $group->sub(do_something => q{
14 my $a = shift;
15 return { sq => $a*$a }
16 });
17
18 my $r = $group->do_something( replicate => [ 4 ] );
19
20 print Dumper($r);
When executed in a Group of two machines, two calls to do_something with argument 4 are performed in parallel, one per machine. The program above produces:
~/grid-machine/examples$ perl smallpar2.pl
$VAR1 = bless( [
bless( {
'stderr' => '',
'errmsg' => '',
'type' => 'RETURNED',
'stdout' => '',
'errcode' => 0,
'results' => [
{
'sq' => 16
}
]
}, 'GRID::Machine::Result' ),
bless( {
'stderr' => '',
'errmsg' => '',
'type' => 'RETURNED',
'stdout' => '',
'errcode' => 0,
'results' => [
{
'sq' => 16
}
]
}, 'GRID::Machine::Result' )
], 'GRID::Machine::Group::Result' );
We can also use a CODE ref to specify the argument to replicate (see line 18 below):
~/grid-machine/examples$ cat -n smallpar3.pl
1 #!/usr/bin/perl -w
2 use strict;
3 use Data::Dumper;
4
5 use GRID::Machine;
6 use GRID::Machine::Group;
7
8 my @MACHINE_NAMES = split /\s+/, $ENV{MACHINES};
9 my @m = map { GRID::Machine->new(host => $_) } @MACHINE_NAMES;
10
11 my $group = GRID::Machine::Group->new(cluster => \@m);
12
13 my @r = $group->sub(do_something => q{
14 my $a = shift;
15 return { sq => $a*$a }
16 });
17
18 my $r = $group->do_something( replicate => sub { [ 1+$_->logic_id ]} );
19
20 print Dumper($r);
When executed, it gives:
~/grid-machine/examples$ perl smallpar3.pl
$VAR1 = bless( [
bless( {
'stderr' => '',
'errmsg' => '',
'type' => 'RETURNED',
'stdout' => '',
'errcode' => 0,
'results' => [
{
'sq' => 1
}
]
}, 'GRID::Machine::Result' ),
bless( {
'stderr' => '',
'errmsg' => '',
'type' => 'RETURNED',
'stdout' => '',
'errcode' => 0,
'results' => [
{
'sq' => 4
}
]
}, 'GRID::Machine::Result' )
], 'GRID::Machine::Group::Result' );
If we want to call some sub with no arguments we can just call it with replicate => []:
~/grid-machine/examples$ cat -n smallpar1.pl
1 #!/usr/bin/perl -w
2 use strict;
3 use Data::Dumper;
4
5 use GRID::Machine;
6 use GRID::Machine::Group;
7
8 my @MACHINE_NAMES = split /\s+/, $ENV{MACHINES};
9 my @m = map { GRID::Machine->new(host => $_) } @MACHINE_NAMES;
10
11 my $group = GRID::Machine::Group->new(cluster => \@m);
12
13 my @r = $group->sub(do_something => q{ `uname -a` });
14
15 my $r = $group->do_something( replicate => [] );
16
17 print Dumper($r);
The convenience function void exported by GRID::Machine::Group returns the key/value pair replicate => []. The previous example is equivalent to this:
~/grid-machine/examples$ cat -n smallpar.pl
1 #!/usr/bin/perl -w
2 use strict;
3 use Data::Dumper;
4
5 use GRID::Machine;
6 use GRID::Machine::Group qw{void};
7
8 my @MACHINE_NAMES = split /\s+/, $ENV{MACHINES};
9 my @m = map { GRID::Machine->new(host => $_) } @MACHINE_NAMES;
10
11 my $group = GRID::Machine::Group->new(cluster => \@m);
12
13 my @r = $group->sub(do_something => q{ `uname -a` });
14
15 my $r = $group->do_something( void() );
16
17 print Dumper($r);
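One way to picture replicate is as a rule that builds one argument list per machine. The sketch below is an assumption about that behaviour, written in plain Perl: Fake::Machine and expand_replicate are hypothetical names, only logic_id is modelled, and accepting a bare scalar from the CODE ref is a guess based on the two examples in this manual that return 1+$_->logic_id with and without brackets.

```perl
use strict;
use warnings;

# Stand-in for a machine object; only logic_id is modelled here.
package Fake::Machine;
sub new      { my ($class, $id) = @_; return bless { id => $id }, $class }
sub logic_id { return $_[0]{id} }

package main;

# Hypothetical helper: return one argument list (array ref) per machine.
sub expand_replicate {
    my ($replicate, @machines) = @_;
    return map {
        # Inside map, $_ is the current machine, so a CODE ref can use it.
        my $v = ref($replicate) eq 'CODE' ? $replicate->() : $replicate;
        ref($v) eq 'ARRAY' ? [ @$v ] : [ $v ];
    } @machines;
}

my @m = map { Fake::Machine->new($_) } 0 .. 1;

my @fixed   = expand_replicate([4], @m);
my @per_id  = expand_replicate(sub { [ 1 + $_->logic_id ] }, @m);
my @nothing = expand_replicate([], @m);

print 'fixed:   ', join(' ', map { "(@$_)" } @fixed),   "\n";
print 'per_id:  ', join(' ', map { "(@$_)" } @per_id),  "\n";
print 'nothing: ', join(' ', map { "(@$_)" } @nothing), "\n";
```

The three calls mirror replicate => [ 4 ], replicate => sub { ... } and replicate => [] from the examples above.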
About the pi Example Used in this Manual
In this manual we have repeatedly used as an example the computation of an approximation to the number Pi (3.14159...) using numerical integration. To understand it, take into account that the area under the curve 1/(1+x**2) between 0 and 1 is Pi/4 = (3.1415...)/4, as the following debugger session shows:
pp2@nereida:~/public_html/cgi-bin$ perl -wde 0
main::(-e:1): 0
DB<1> use Math::Integral::Romberg 'integral'
DB<2> p integral(sub { my $x = shift; 4/(1+$x*$x) }, 0, 1);
3.14159265358972
The module Math::Integral::Romberg provides the function integral, which computes the area under a given function over some interval. In fact, if you remember your high school days, it is easy to see the reason: the antiderivative of 4/(1+$x*$x) is 4*arctg($x) and so the area between 0 and 1 is given by:
4*(arctg(1) - arctg(0)) = 4 * arctg(1) = 4 * Pi / 4 = Pi
This is not, in fact, a good way to compute Pi, but makes a good example of how to exploit several machines to fulfill a task.
To compute the area under 4/(1+$x*$x) we divide the interval [0,1] into N sub-intervals of size 1/N and add up the areas of the small rectangles with base 1/N and height equal to the value of the curve 4/(1+$x*$x) at the midpoint of each sub-interval. The following debugger session illustrates the idea:
pp2@nereida:~$ perl -wde 0
main::(-e:1): 0
DB<1> use List::Util qw(sum)
DB<2> $N = 6
DB<3> @divisions = map { $_/$N } 0..($N-1)
DB<4> sub f { my $x = shift; 4/(1+$x*$x) }
DB<5> @halves = map { $_+0.5/$N } @divisions
DB<6> $area = sum(map { f($_)/$N } @halves)
DB<7> p $area
3.14390742722244
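The debugger session can be condensed into a small script. The sketch below uses only core Perl: midpoint_pi is a name chosen for this example, and the exact value is taken from 4*atan2(1,1), which is the 4*arctg(1) of the derivation above.

```perl
use strict;
use warnings;

# The integrand whose area on [0,1] is Pi.
sub f { my $x = shift; return 4 / (1 + $x * $x) }

# Midpoint rule on [0,1] with $N equal sub-intervals.
sub midpoint_pi {
    my $N   = shift;
    my $sum = 0;
    $sum += f(($_ + 0.5) / $N) for 0 .. $N - 1;
    return $sum / $N;
}

my $exact = 4 * atan2(1, 1);    # 4*arctg(1) = Pi

for my $N (6, 100, 1000) {
    my $approx = midpoint_pi($N);
    printf "N=%-5d approx=%.12f error=%.3e\n", $N, $approx, $approx - $exact;
}
```

The error shrinks as N grows, which is why the examples in this manual use N = 1000 or more.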
To reduce the execution time we distribute the sum in command 6 of the session above, $area = sum(map { f($_)/$N } @halves), among several processes. The processes are numbered from 0 to np-1. Each process sums up the areas of roughly N/np intervals. We can speed up the computation if each process is allocated to a different processor (or core).
Performance
The following code is a modification of the canonical example computing pi. Timers have been included (lines 30 and 32) to see the influence of the number of processors:
$ cat -n pi.pl
1 #!/usr/bin/perl -w
2 use strict;
3 use GRID::Machine;
4 use GRID::Machine::Group;
5 use List::Util qw(sum);
6 use Time::HiRes qw(time gettimeofday tv_interval);
7
8 my @MACHINE_NAMES = split /[\s:]+/, ($ENV{MACHINES} || '');
9 @MACHINE_NAMES = ('', '') unless @MACHINE_NAMES;
10
11 my @m = map { GRID::Machine->new(host => $_, wait => 5, survive => 1) } @MACHINE_NAMES;
12
13 my $c = GRID::Machine::Group->new(cluster => [ @m ]);
14
15 $c->sub(suma_areas => q{
16 my ($id, $N, $np) = @_;
17
18 my $sum = 0;
19 for (my $i = $id; $i < $N; $i += $np) {
20 my $x = ($i + 0.5) / $N;
21 $sum += 4 / (1 + $x * $x);
22 }
23 $sum /= $N;
24 });
25
26 my ($N, $np, $pi) = (1e7, 4, 0);
27
28 my @args = map { [$_, $N, $np] } 0..$np-1;
29
30 my $t0 = [gettimeofday];
31 $pi = sum($c->suma_areas(args => \@args)->Results);
32 my $elapsed = tv_interval ($t0);
33 print "Pi = $pi. N = $N Time = $elapsed\n";
When executed on one machine, it takes 5.059222 seconds:
~/grid-machine$ export MACHINES='local'
~/grid-machine$ perl examples/pi.pl
Pi = 3.14159265358978. N = 10000000 Time = 5.059222
When executed in a cluster with two nodes we get a speedup of 1.93 = 5.06/2.62:
~/grid-machine$ export MACHINES='local imac'
~/grid-machine$ perl examples/pi.pl
Pi = 3.14159265358978. N = 10000000 Time = 2.625359
When executed in a machine with two cores, it also has an almost linear speedup:
~/grid-machine$ unset MACHINES
~/grid-machine$ perl examples/pi.pl
Pi = 3.14159265358978. N = 10000000 Time = 2.685503
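The speedup figures can be recomputed from the elapsed times printed above. The sketch below only does the arithmetic: the times are copied from the three runs, and efficiency is taken to be the speedup divided by the number of processors (2 in both parallel runs), which is the usual definition rather than something stated in this manual.

```perl
use strict;
use warnings;

# Elapsed times in seconds, copied from the three runs above.
my $t_one      = 5.059222;
my %t_parallel = ('two nodes' => 2.625359, 'two cores' => 2.685503);

my %speedup;
for my $cfg (sort keys %t_parallel) {
    $speedup{$cfg} = $t_one / $t_parallel{$cfg};
    printf "%-9s speedup = %.2f, efficiency = %.0f%%\n",
        $cfg, $speedup{$cfg}, 100 * $speedup{$cfg} / 2;
}
```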
AUTHOR
Casiano Rodriguez Leon <casiano@ull.es>
ACKNOWLEDGMENTS
This work has been supported by CEE (FEDER) and the Spanish Ministry of Educacion y Ciencia through Plan Nacional I+D+I number TIN2005-08818-C04-04 (ULL::OPLINK project http://www.oplink.ull.es/). Support from Gobierno de Canarias was through GC02210601 (Grupos Consolidados). The University of La Laguna has also supported my work in many ways and for many years.
I wish to thank Paul Evans for his IPC::PerlSSH module: it was the source of inspiration for this module. Thanks also to Alex White, Dmitri Kargapolov, Eric Busto and Erik Welch for their contributions, and to the Perl Monks and the Perl Community for generously sharing their knowledge. Finally, thanks to Juana, Coro and my students at La Laguna.
LICENCE AND COPYRIGHT
Copyright (c) 2007 Casiano Rodriguez-Leon (casiano@ull.es). All rights reserved.
These modules are free software; you can redistribute it and/or modify it under the same terms as Perl itself. See perlartistic.
This program is distributed in the hope that it will be useful, but WITHOUT ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.