NAME

PDL::Parallel::MPI

Routines to allow PDL objects to be moved around on parallel systems using the MPI library.

SYNOPSIS

        use PDL;
        use PDL::Parallel::MPI;
        mpirun(2);

        MPI_Init();
        $rank = get_rank();
        $a=$rank * ones(2);
        print "my rank is $rank and \$a is $a\n";
        $a->move( 1 => 0);
        print "my rank is $rank and \$a is $a\n";
        MPI_Finalize();

MPI STANDARD CALLS

Most of the functions from the MPI standard may be used from this module on regular Perl data. This functionality is inherited from the Parallel::MPI module; read the documentation for Parallel::MPI for usage details.

One may mix MPI calls on Perl built-in datatypes with MPI calls on piddles.

        use PDL;
        use PDL::Parallel::MPI;
        mpirun(2);

        MPI_Init();
        $rank = get_rank();
        $pi = 3.1;
        if ($rank == 0) {
                MPI_Send(\$pi,1,MPI_DOUBLE,1,0,MPI_COMM_WORLD);
        } else {
                $message = zeroes(1);
                $message->receive(0);
                print "pi is $message\n";
        }
        MPI_Finalize();

MPI GENERIC CALLS

MPI_Init

Call this before you pass around any piddles or make any other MPI calls.

        # usage:
        MPI_Init();

MPI_Finalize

Call this before your MPI program exits, or you may get zombie processes.

        # usage:
        MPI_Finalize();

get_rank

Returns an integer identifying the calling process; ranks start at 0. Takes an optional communicator argument.

        # usage
        get_rank($comm);  # comm is optional and defaults to MPI_COMM_WORLD

comm_size

Returns the number of processes in the communicator. Takes an optional communicator argument.

        # usage
        comm_size($comm);  # comm is optional and defaults to MPI_COMM_WORLD

mpirun

Typically one would invoke an MPI program using mpirun, which comes with your MPI implementation. This function is simply a dirty hack which makes the script re-invoke itself using that program.

        mpirun($number_of_processes);

PDL SPECIFIC MPI CALLS

move

move is a piddle method. It copies a piddle from one process to another. The first argument is the rank of the source process, and the second argument is the rank of the receiving process. The piddle should be allocated with the same size and datatype on both machines (this is not checked). The method does nothing if executed on a process which is neither the source nor the destination. => may be used in place of "," for readability.

        # usage
        $piddle->move($source_processor , $dest_processor);
        # example
        $a = $rank * ones(4);
        $a->move( 0 => 1);

send / receive

You can use send and receive to move piddles around by yourself, although I really recommend using move instead.

        $piddle->send($dest,$tag,$comm);   # $tag and $comm are optional
        $piddle->receive($source,$tag,$comm);  # ditto
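
A minimal example, with $tag and $comm left at their defaults. Each send must be matched by a receive on the destination process:

        # move $a from process 0 to process 1 by hand
        if ($rank == 0)    { $a->send(1);    }
        elsif ($rank == 1) { $a->receive(0); }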

broadcast

Piddle method which copies the value at the root process to all of the other processes in the communicator. The root defaults to 0 if not specified and the communicator to MPI_COMM_WORLD. Piddles should be pre-allocated to be the same size and datatype on all processes.

        # usage
        $piddle->broadcast($root,$comm);  # $root and $comm optional
        # example
        $a=$rank * ones(4);
        $a->broadcast(3);

send_nonblocking / receive_nonblocking

These piddle methods initiate communication and return before that communication is completed. They return a request object which can be checked for completion or waited on. Data at source and dest should be pre-allocated to have the same size and datatype.

        # $tag and $comm are optional arguments.
        $request = $piddle->send_nonblocking($dest_proc,$tag,$comm);
        $request = $piddle->receive_nonblocking($dest_proc,$tag,$comm);
        ...
        $request->wait();  # blocks until the communication is completed.
                or
        $request->test();  # returns true if the communication is completed.
        # $request is deallocated after a wait or test returns true.
        # this example is similar to how mpi_rotate is implemented.
        $r_send         = $source->send_nonblocking(($rank+1)   % $population);
        $r_receive  = $dest->receive_nonblocking(($rank-1)  % $population);
        $r_receive->wait();  
        $r_send->wait();

get_status / print_status

get_status returns a hashref containing the status of the last receive. The fields are count, source, tag, and error. print_status simply prints the status nicely. Note that if there is an error in a receive, an exception will be thrown.

        print_status();
        print ${get_status()}{count};

mpi_rotate

mpi_rotate is a piddle method which should be executed at the same time on all processes. For each process, it moves the entire piddle to the next process. This movement is (inefficiently) done in place by default, or you can specify a destination.

        $piddle->mpi_rotate(
                dest => $dest_piddle,   # optional
                offset => $offset,      # optional, defaults to +1
        );
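
A short example in the style of the other methods. The resulting data placement is an assumption from the default offset of +1 (each process's piddle moves forward one rank, wrapping around):

        # example: pass each piddle to the next process in the ring
        $a = $rank * ones(4);
        $a->mpi_rotate;
        # each process should now hold the piddle previously on the
        # preceding process (modulo comm_size), assuming offset => +1.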

scatter

Takes a piddle and splits its data onto all of the processes. This takes an n-dimensional piddle on the root and turns it into an (n-1)-dimensional piddle on all processes. It may be called as a piddle method, which is equivalent to simply specifying the 'source' argument. On the root, one must specify the source. On all other processes, one may also pass a 'source' argument so that scatter can determine the size of the destination piddle to allocate. Alternatively, on the non-root processes one may specify the dest piddle explicitly, or simply specify the dimensions of the destination piddle.

        # usage (all arguments are optional, but see above).
        # may be used as a piddle method, which simply sets the
        # source argument.
        
        $dest_piddle = scatter(
                source => $src_piddle,
                dest   => $dest_piddle,  
                dims   => $array_ref,
                root   => $root_proc,      # root defaults to 0.
                comm   => $comm,           # defaults to MPI_COMM_WORLD
        );
        # with 4 processes
        $a = sequence(4,4);
        $b = $a->scatter;

gather

gather is the opposite of scatter. Using it as a piddle method simply specifies the source. If called on an n-dimensional piddle on all processes, the root will contain an (n+1)-dimensional piddle on completion.

                  memory =>
                +------------+                    +-------------+
          ^     |a0          |      ----->        | a0 a1 a2 a3 |
   procs  |     |a1          |     gather         |             |
          |     |a2          |                    |             |
                |a3          |    <-----          |             |
                +------------+    scatter         +-------------+
        # usage
        gather(
                source => $src_piddle,
                dest   => $dest_piddle,  # only used at root, extrapolated from source if not specified.
                root   => $root_proc,    # defaults to 0
                comm   => $comm,         # defaults to MPI_COMM_WORLD
        );
        # example.  assume nprocs == 4.
        $a = ones(4);
        $b = $a->gather;
        # $b->dims now is (4,4) on proc 0.

allgather

allgather does the same thing as gather except that the result is placed on all processors rather than just the root.

                  memory =>
                +------------+                    +-------------+
          ^     |a0          |                    | a0 a1 a2 a3 |
   procs  |     |a1          |     ----->         | a0 a1 a2 a3 |
          |     |a2          |     allgather      | a0 a1 a2 a3 |
                |a3          |                    | a0 a1 a2 a3 |
                +------------+                    +-------------+
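
A usage sketch by analogy with gather. The argument names here are assumptions based on the other collective calls (there is no root argument, since the result lands on every process):

        # usage (also as piddle method; source is set)
        # NOTE: argument names assumed by analogy with gather.
        $dest_piddle = allgather(
                source => $src_piddle,
                dest   => $dest_piddle,  # created for you if not specified
                comm   => $comm,         # defaults to MPI_COMM_WORLD
        );
        # example: assume comm_size is 4.
        $a = $rank * ones(4);
        $b = $a->allgather;
        # $b should now be a (4,4) piddle on every process.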

alltoall

                  memory =>
                +-------------+                    +-------------+
          ^     | a0 a1 a2 a3 |                    | a0 b0 c0 d0 |
   procs  |     | b0 b1 b2 b3 |     ----->         | a1 b1 c1 d1 |
          |     | c0 c1 c2 c3 |     alltoall       | a2 b2 c2 d2 |
                | d0 d1 d2 d3 |                    | a3 b3 c3 d3 |
                +-------------+                    +-------------+
        # usage
        # calling as piddle method simply sets the source argument.
        $dest_piddle = alltoall(
                source => $src_piddle,
                dest   => $dest_piddle,  # created for you if not passed.
                comm   => $comm,         # defaults to MPI_COMM_WORLD.
        );
        # example: assume comm_size is 4.
        $a = $rank * sequence(4);
        $b = $a->alltoall;

reduce

 +-------------+             +----------------------------------+
 | a0 a1 a2 a3 |    reduce   | a0+b0+c0+d0 , a1+b1+c1+d1, ....  |
 | b0 b1 b2 b3 |    ----->   |                                  |
 | c0 c1 c2 c3 |             |                                  |
 | d0 d1 d2 d3 |             |                                  |
 +-------------+             +----------------------------------+

Allowed operations are: + * max min & | ^ and or xor.

        # usage  (also as piddle method; source is set)
        $dest_piddle = reduce(
                source => $src_piddle,
                dest   => $dest_piddle,  # significant only at root & created for you if not specified
                root   => $root,         # defaults to 0
                op     => $op,           # defaults to '+'
                comm   => $comm,         # defaults to MPI_COMM_WORLD
        );
        # example
        $a=$rank * (sequence(4)+1);
        $b=$a->reduce; 

allreduce

Just like reduce except that the result is put on all the processes.

        # usage  (also as piddle method; source is set)
        $dest_piddle = allreduce(
                source => $src_piddle,
                dest   => $dest_piddle,  # created for you if not specified
                root   => $root,         # defaults to 0
                op     => $op,           # defaults to '+'
                comm   => $comm,         # defaults to MPI_COMM_WORLD
        );
        # example
        $a=$rank * (sequence(4)+1);
        $b=$a->allreduce; 

scan

 +-------------+             +----------------------------------+
 | a0 a1 a2 a3 |    scan     | a0 ,  a1 , a2 , a3               |
 | b0 b1 b2 b3 |    ----->   | a0+b0 , a1+b1 , a2+b2, a3+b3     |
 | c0 c1 c2 c3 |             |                                  |
 | d0 d1 d2 d3 |             |    ...                           |
 +-------------+             +----------------------------------+

Allowed operations are: + * max min & | ^ and or xor.

        # usage  (also as piddle method; source is set)
        $dest_piddle = scan(
                source => $src_piddle,
                dest   => $dest_piddle,  # created for you if not specified
                root   => $root,         # defaults to 0
                op     => $op,           # defaults to '+'
                comm   => $comm,         # defaults to MPI_COMM_WORLD
        );
        # example
        $a=$rank * (sequence(4)+1);
        $b=$a->scan; 

reduce_and_scatter

Does a reduce followed by a scatter. A regular scatter distributes the data evenly over all processes, but with reduce_and_scatter you get to specify the distribution (if you want; defaults to uniform).

        # usage  (also as piddle method; source is set)
        $dest_piddle = reduce_and_scatter(
                source => $src_piddle,
                dest   => $dest_piddle,             # created for you if not specified.
                recv_count => $recv_count_piddle,   # 1D int piddle.  put $r[$i] elements on proc $i.
                op     => $op,                      # defaults to '+'
                comm   => $comm,                    # defaults to MPI_COMM_WORLD
        );
        # example, taken from t/20_reduce_and_scatter.t
        mpirun(4); MPI_Init(); $rank = get_rank();
        $a=$rank * (sequence(4)+1);
        $b=$a->reduce_and_scatter; 
        print "rank = $rank, b=$b\n" ;

WARNINGS

This module is still under development. Significant changes are expected. As things are *expected* to break somehow or another, there is no warranty, and the author does not take responsibility for the damage/destruction of your data/computer/sanity.

PLANS

indexing

Currently there is no support for any sort of indexing/dataflow/child piddles. Ideally one would like to say:

        $piddle->diagonal(0,1)->move(0 => 1);

But currently one must say:

        $tmp = $piddle->diagonal(0,1)->copy;
        $tmp->move( 0 => 1);
        $piddle->diagonal(0,1) .= $tmp;

I believe the former behavior is possible to implement. I plan to do so once I reach a sufficient degree of enlightenment.

distributed data

One might wish to use their own personal massively parallel supercomputer interactively with the pdl shell (perldl). This would require master/slave interactions and a distributed data model. Such a project won't be started until after I finish PDL::Parallel::OpenMP.

AUTHOR

Darin McGill darin@ocf.berkeley.edu

If you find this module useful, please let me know. I very much appreciate bug reports, suggestions and other feedback.

ACKNOWLEDGEMENTS

This module is an extension of Parallel::MPI, written by Josh Wilmes and Chris Stevens. Significant portions of code have been copied from Parallel::MPI verbatim, used with permission. Josh and Chris did most of the work to make Perl's built-in datatypes (scalars, arrays) work with MPI. I rely heavily on their work for MPI initialization and error handling.

The diagrams in this document were inspired by and are similar to diagrams found in MPI-The Complete Reference.

Sections of code from the main PDL distribution (such as header files, and code from PDL::CallExt.xs) were used extensively in development. Many thanks to the PDL developers for their help, and of course, for creating the PDL system in the first place.

SEE ALSO

The PDL::Parallel homepage: http://www.ocf.berkeley.edu/~darin/projects/superperl

The PDL home page: http://pdl.perl.org

The Perl module Parallel::MPI.

The Message Passing Interface: http://www.mpi-forum.org/

PDL::Parallel::OpenMP (under development).

COPYING

This module is free software. It may be modified and/or redistributed under the same terms as perl itself.