MCE::Examples - A list of examples demonstrating Many-core Engine
This document describes MCE::Examples version 1.499_004
MCE ships with various examples showing real-world use cases, from parallelizing something as small as cat (try with -n) to searching for patterns and aggregating word counts.
barrier_sync.pl
    A barrier sync demonstration.

cat.pl
    Concatenation script, similar to the cat binary.

egrep.pl
    Egrep script, similar to the egrep binary.

wc.pl
    Word count script, similar to the wc binary.

findnull.pl
    A parallel-driven script to report lines containing null fields. It's many times faster than the binary egrep command. Try against a large file containing very long lines.

foreach.pl, forseq.pl, forchunk.pl
    These take the same sqrt example from Parallel::Loops and measure the overhead of the engine. The number indicates the size of @input which can be submitted and results displayed in under 1 second.

    Parallel::Loops is based on Parallel::ForkManager. MCE utilizes a pool of workers.

        Parallel::Loops:     600  Forking each @input is expensive
        MCE foreach....:  34,000  Sends result after each @input
        MCE forseq.....:  70,000  Loops through sequence of numbers
        MCE forchunk...: 465,000  Chunking reduces overhead

interval.pl
    Demonstration of the interval option appearing in MCE 1.5.

matmult/matmult_base.pl, matmult_mce.pl, strassen_mce.pl
    Various matrix multiplication demonstrations benchmarking PDL, PDL + MCE, as well as parallelizing Strassen's divide-and-conquer algorithm. Also included are 2 plain Perl examples.

scaling_pings.pl
    Perform ping test and report back failing IPs to standard output.

seq_demo.pl
    A demonstration of the new sequence option appearing in MCE 1.3. Run with seq_demo.pl | sort

tbray/wf_mce1.pl, wf_mce2.pl, wf_mce3.pl
    An implementation of wide finder utilizing MCE. As fast as MMAP IO when file resides in OS FS cache. 2x ~ 3x faster when reading directly from disk.
Imagine a long-running process in which you want to process an array in parallel across a pool of workers. Note: if you simply want to loop through a sequence of numbers in parallel, one number at a time, use the sequence option instead.
Below, a callback function for displaying results is used. The logic shows how one can display results immediately while still preserving the output order as if processing serially. The %result hash is a temporary cache to store results for out-of-order replies.
    use MCE;

    my @input_data  = (0 .. 18000 - 1);
    my $max_workers = 3;
    my $order_id    = 1;
    my %result;

    ## Callback function for displaying results.

    sub display_result {
        my ($wk_result, $chunk_id) = @_;
        $result{$chunk_id} = $wk_result;

        while (1) {
            last unless (exists $result{$order_id});

            printf "i: %d sqrt(i): %f\n",
                $input_data[$order_id - 1], $result{$order_id};

            delete $result{$order_id};
            $order_id++;
        }
    }

    ## Compute via MCE.

    my $mce = MCE->new(
        input_data  => \@input_data,
        max_workers => $max_workers,
        chunk_size  => 1,

        user_func => sub {
            my ($self, $chunk_ref, $chunk_id) = @_;
            my $wk_result = sqrt($chunk_ref->[0]);
            MCE->do('display_result', $wk_result, $chunk_id);
        }
    );

    MCE->run();
    ## Compute via MCE. Foreach implies chunk_size => 1.

    my $mce = MCE->new(
        max_workers => $max_workers
    );

    ## Worker calls code block passing a reference to an array containing
    ## one item. Use $chunk_ref->[0] to retrieve the single element.

    MCE->foreach(\@input_data, sub {
        my ($self, $chunk_ref, $chunk_id) = @_;
        my $wk_result = sqrt($chunk_ref->[0]);
        MCE->do('display_result', $wk_result, $chunk_id);
    });
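The order-preserving logic in display_result can be tried without any workers at all. The following is a minimal sketch in plain Perl, not part of MCE: the helper name make_ordered_emitter and the simulated arrival order are assumptions made for illustration. It buffers each result under its chunk id and flushes whatever has become contiguous with the output so far.

```perl
use strict;
use warnings;

## Hypothetical helper (not part of MCE): returns a callback that buffers
## results by chunk_id and emits them in ascending order, starting at 1.
sub make_ordered_emitter {
    my ($emit) = @_;
    my $order_id = 1;
    my %result;

    return sub {
        my ($wk_result, $chunk_id) = @_;
        $result{$chunk_id} = $wk_result;

        ## Flush every result that is now contiguous with what was emitted.
        while (exists $result{$order_id}) {
            $emit->($order_id, delete $result{$order_id});
            $order_id++;
        }
        return;
    };
}

my @emitted;
my $display = make_ordered_emitter(sub { push @emitted, "$_[0]:$_[1]" });

## Simulate replies arriving out of order: chunks 3, 1, 4, 2.
$display->('c', 3);
$display->('a', 1);
$display->('d', 4);
$display->('b', 2);

print join(' ', @emitted), "\n";   ## prints "1:a 2:b 3:c 4:d"
```

Nothing is emitted for chunk 3 until chunks 1 and 2 have arrived, which is why the cache only ever holds the replies that ran ahead of the slowest outstanding chunk.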
Chunking reduces overhead many times over. Instead of passing a single item from @input_data, a chunk of $chunk_size items is sent to the next available worker. The sequence option can be used as well if you simply want to loop through a sequence of numbers in parallel with chunking applied.
    use MCE;

    my @input_data  = (0 .. 385000 - 1);
    my $max_workers = 3;
    my $chunk_size  = 500;
    my $order_id    = 1;
    my %result;

    ## Callback function for displaying results.

    sub display_result {
        my ($wk_result, $chunk_id) = @_;
        $result{$chunk_id} = $wk_result;

        while (1) {
            last unless (exists $result{$order_id});

            ## Index of the first item belonging to this chunk.
            my $i = ($order_id - 1) * $chunk_size;

            foreach ( @{ $result{$order_id} } ) {
                printf "i: %d sqrt(i): %f\n", $input_data[$i++], $_;
            }

            delete $result{$order_id};
            $order_id++;
        }
    }

    ## Compute via MCE.

    my $mce = MCE->new(
        input_data  => \@input_data,
        max_workers => $max_workers,
        chunk_size  => $chunk_size,

        user_func => sub {
            my ($self, $chunk_ref, $chunk_id) = @_;
            my @wk_result;

            foreach ( @{ $chunk_ref } ) {
                push @wk_result, sqrt($_);
            }

            MCE->do('display_result', \@wk_result, $chunk_id);
        }
    );

    MCE->run();
    ## Compute via MCE.

    my $mce = MCE->new(
        max_workers => $max_workers,
        chunk_size  => $chunk_size
    );

    ## Below, $chunk_ref is a reference to an array containing the next
    ## $chunk_size items from @input_data.

    MCE->forchunk(\@input_data, sub {
        my ($self, $chunk_ref, $chunk_id) = @_;
        my @wk_result;

        foreach ( @{ $chunk_ref } ) {
            push @wk_result, sqrt($_);
        }

        MCE->do('display_result', \@wk_result, $chunk_id);
    });
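The chunked callback relies on the fact that chunk ids count from 1, so the first item of a chunk sits at index ($chunk_id - 1) * $chunk_size. The sketch below illustrates that arithmetic in plain Perl on a tiny array; it does not call MCE, and the helper name chunk_offset is an assumption made for illustration. With the sizes used above, 385,000 items at a chunk size of 500 amount to 770 replies to the manager process instead of 385,000.

```perl
use strict;
use warnings;

my @input_data = (0 .. 11);
my $chunk_size = 5;

## Slice @input_data into consecutive chunks of $chunk_size items,
## with chunk ids counted from 1 (the last chunk may be shorter).
my %chunk_for;
my $chunk_id = 0;

for (my $i = 0; $i < @input_data; $i += $chunk_size) {
    my $last = $i + $chunk_size - 1;
    $last = $#input_data if $last > $#input_data;
    $chunk_for{ ++$chunk_id } = [ @input_data[$i .. $last] ];
}

## Hypothetical helper: starting index of a chunk, given its id.
sub chunk_offset {
    my ($id, $size) = @_;
    return ($id - 1) * $size;
}

for my $id (sort { $a <=> $b } keys %chunk_for) {
    printf "chunk_id %d starts at index %d and holds %d item(s)\n",
        $id, chunk_offset($id, $chunk_size), scalar @{ $chunk_for{$id} };
}
```

The last chunk holds only the two leftover items, so a callback should iterate over the returned array rather than assume every chunk is full.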
The sequence option may be specified per task. The following is taken directly from the seq_demo.pl example. Think of the demonstration as 3 mini-MCEs running in parallel. Chunking can be configured independently for each task as well.
    use MCE;

    ## Run with seq_demo.pl | sort

    sub user_func {
        my ($self, $seq_n, $chunk_id) = @_;

        my $wid      = MCE->wid();
        my $task_id  = MCE->task_id();
        my $task_wid = MCE->task_wid();

        if (ref $seq_n eq 'ARRAY') {
            ## Received the next "chunked" sequence of numbers
            ## e.g. when chunk_size > 1, $seq_n will be an array ref above

            foreach (@{ $seq_n }) {
                printf(
                    "task_id %d: seq_n %s: chunk_id %d: wid %d: task_wid %d\n",
                    $task_id, $_, $chunk_id, $wid, $task_wid
                );
            }
        }
        else {
            printf(
                "task_id %d: seq_n %s: chunk_id %d: wid %d: task_wid %d\n",
                $task_id, $seq_n, $chunk_id, $wid, $task_wid
            );
        }
    }

    ## Each task can be configured independently.

    my $mce = MCE->new(
        user_tasks => [{
            max_workers => 2,
            chunk_size  => 1,
            sequence    => { begin => 11, end => 19, step => 1 },
            user_func   => \&user_func
        },{
            max_workers => 2,
            chunk_size  => 5,
            sequence    => { begin => 21, end => 29, step => 1 },
            user_func   => \&user_func
        },{
            max_workers => 2,
            chunk_size  => 3,
            sequence    => { begin => 31, end => 39, step => 1 },
            user_func   => \&user_func
        }]
    );

    MCE->run();

    -- Output

    task_id 0: seq_n 11: chunk_id 1: wid 1: task_wid 1
    task_id 0: seq_n 12: chunk_id 2: wid 2: task_wid 2
    task_id 0: seq_n 13: chunk_id 3: wid 1: task_wid 1
    task_id 0: seq_n 14: chunk_id 4: wid 2: task_wid 2
    task_id 0: seq_n 15: chunk_id 5: wid 1: task_wid 1
    task_id 0: seq_n 16: chunk_id 6: wid 2: task_wid 2
    task_id 0: seq_n 17: chunk_id 7: wid 1: task_wid 1
    task_id 0: seq_n 18: chunk_id 8: wid 2: task_wid 2
    task_id 0: seq_n 19: chunk_id 9: wid 1: task_wid 1
    task_id 1: seq_n 21: chunk_id 1: wid 3: task_wid 1
    task_id 1: seq_n 22: chunk_id 1: wid 3: task_wid 1
    task_id 1: seq_n 23: chunk_id 1: wid 3: task_wid 1
    task_id 1: seq_n 24: chunk_id 1: wid 3: task_wid 1
    task_id 1: seq_n 25: chunk_id 1: wid 3: task_wid 1
    task_id 1: seq_n 26: chunk_id 2: wid 4: task_wid 2
    task_id 1: seq_n 27: chunk_id 2: wid 4: task_wid 2
    task_id 1: seq_n 28: chunk_id 2: wid 4: task_wid 2
    task_id 1: seq_n 29: chunk_id 2: wid 4: task_wid 2
    task_id 2: seq_n 31: chunk_id 1: wid 5: task_wid 1
    task_id 2: seq_n 32: chunk_id 1: wid 5: task_wid 1
    task_id 2: seq_n 33: chunk_id 1: wid 5: task_wid 1
    task_id 2: seq_n 34: chunk_id 2: wid 6: task_wid 2
    task_id 2: seq_n 35: chunk_id 2: wid 6: task_wid 2
    task_id 2: seq_n 36: chunk_id 2: wid 6: task_wid 2
    task_id 2: seq_n 37: chunk_id 3: wid 5: task_wid 1
    task_id 2: seq_n 38: chunk_id 3: wid 5: task_wid 1
    task_id 2: seq_n 39: chunk_id 3: wid 5: task_wid 1
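The chunk_id column in the output follows from how each task's sequence is grouped by its chunk_size. The sketch below reproduces that grouping in plain Perl for the second and third tasks. It is only a model of what the output shows, not MCE's internal implementation; the helper name sequence_chunks is hypothetical and a positive step is assumed.

```perl
use strict;
use warnings;

## Hypothetical helper: group begin..end (advancing by step) into chunks
## of at most $chunk_size numbers. Chunk ids are the array index plus 1.
## Assumes a positive step.
sub sequence_chunks {
    my ($begin, $end, $step, $chunk_size) = @_;
    my (@chunks, @current);

    for (my $n = $begin; $n <= $end; $n += $step) {
        push @current, $n;
        if (@current == $chunk_size) {
            push @chunks, [@current];
            @current = ();
        }
    }

    ## Keep the final, possibly shorter, chunk.
    push @chunks, [@current] if @current;
    return \@chunks;
}

## Arguments: begin, end, step, chunk_size (values from the tasks above).
for my $task ([21, 29, 1, 5], [31, 39, 1, 3]) {
    my $chunks = sequence_chunks(@$task);
    for my $i (0 .. $#$chunks) {
        printf "begin %d, chunk_size %d: chunk_id %d => %s\n",
            $task->[0], $task->[3], $i + 1, join(' ', @{ $chunks->[$i] });
    }
}
```

For the task starting at 21, this yields two chunks (21 to 25, then 26 to 29), matching the two chunk ids in the output. Note that in the real user_func a chunk_size of 1 delivers a plain number rather than an array reference, which is why the code checks ref $seq_n.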
Both the input_data and sequence options are optional in MCE. One can simply use MCE to run multiple workers in parallel. The "do" and "sendto" methods can be used to pass data back to the manager process. A worker doesn't have to finish processing before passing data back. Calls to "do" and "sendto" are processed serially by the main process on a first-come, first-served basis. All 4 workers run in parallel for the demonstration below.
    use MCE;

    sub report_stats {
        my ($wid, $msg, $hash_ref) = @_;
        print "Worker $wid says $msg: ", $hash_ref->{'counter'}, "\n";
    }

    my $mce = MCE->new(
        max_workers => 4,

        user_func => sub {
            my ($self) = @_;
            my $wid = MCE->wid();

            if ($wid == 1) {
                my %hash = ('counter' => 0);
                while (1) {
                    $hash{'counter'} += 1;
                    MCE->do('report_stats', $wid, 'Hello there', \%hash);
                    last if ($hash{'counter'} == 4);
                    sleep 2;
                }
            }
            else {
                my %hash = ('counter' => 0);
                while (1) {
                    $hash{'counter'} += 1;
                    MCE->do('report_stats', $wid, 'Welcome ...', \%hash);
                    last if ($hash{'counter'} == 2);
                    sleep 4;
                }
            }

            MCE->sendto('stdout', "Worker $wid is exiting\n");
        }
    );

    MCE->run;

Worker 2 gets there first in 2nd output below.

    $ ./demo.pl
    Worker 1 says Hello there: 1
    Worker 2 says Welcome ...: 1
    Worker 3 says Welcome ...: 1
    Worker 4 says Welcome ...: 1
    Worker 1 says Hello there: 2
    Worker 2 says Welcome ...: 2
    Worker 3 says Welcome ...: 2
    Worker 1 says Hello there: 3
    Worker 2 is exiting
    Worker 3 is exiting
    Worker 4 says Welcome ...: 2
    Worker 4 is exiting
    Worker 1 says Hello there: 4
    Worker 1 is exiting

    $ ./demo.pl
    Worker 2 says Welcome ...: 1
    Worker 1 says Hello there: 1
    Worker 4 says Welcome ...: 1
    Worker 3 says Welcome ...: 1
    Worker 1 says Hello there: 2
    Worker 2 says Welcome ...: 2
    Worker 4 says Welcome ...: 2
    Worker 3 says Welcome ...: 2
    Worker 2 is exiting
    Worker 4 is exiting
    Worker 1 says Hello there: 3
    Worker 3 is exiting
    Worker 1 says Hello there: 4
    Worker 1 is exiting
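The interleaving in both runs follows from the sleep intervals. As a rough aid, the sketch below computes an idealized timeline in plain Perl, assuming zero startup and communication overhead; the helper name report_times is hypothetical and MCE is not involved. Reports that fall on the same second may reach the manager in any order, which is the variation visible between the two runs.

```perl
use strict;
use warnings;

## Idealized model: a worker reports, sleeps $sleep seconds, and repeats
## until it has reported $count times. Returns the report times in seconds.
sub report_times {
    my ($count, $sleep) = @_;
    return map { ($_ - 1) * $sleep } 1 .. $count;
}

my @events;
for my $wid (1 .. 4) {
    ## Worker 1 reports 4 times, 2 seconds apart; the others report
    ## twice, 4 seconds apart (values taken from the demo above).
    my ($count, $sleep) = $wid == 1 ? (4, 2) : (2, 4);
    my $n = 0;
    for my $t (report_times($count, $sleep)) {
        push @events, [ $t, $wid, ++$n ];
    }
}

## Sort by time, then by worker id purely to make the listing stable.
for my $e (sort { $a->[0] <=> $b->[0] || $a->[1] <=> $b->[1] } @events) {
    printf "t=%ds  worker %d  report %d\n", @$e;
}
```

Under this model workers 2 through 4 send their last report at 4 seconds while worker 1 sends its fourth at 6 seconds, consistent with worker 1 exiting last in both runs.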
MCE
Mario E. Roy, <marioeroy AT gmail DOT com>
This program is free software; you can redistribute it and/or modify it under the terms of either: the GNU General Public License as published by the Free Software Foundation; or the Artistic License.
See http://dev.perl.org/licenses/ for more information.