Hadoop::Streaming::Mapper - Simplify writing Hadoop Streaming jobs. Write a map() and reduce() function and let this role handle the Stream interface.
version 0.100060
#!/usr/bin/env perl

package Wordcount::Mapper;
use Moose;
with 'Hadoop::Streaming::Mapper';

sub map {
    my ($self, $key, $value) = @_;

    for (split /\s+/, $value) {
        $self->emit( $_ => 1 );
    }
}

package main;
Wordcount::Mapper->run;
Your mapper class must implement map($key, $value), and your reducer class must implement reduce($key, $value). The roles add emit() and run() methods to your classes.
Package->run();
This method starts the Hadoop::Streaming::Mapper instance.
After creating a new object instance, it reads from STDIN and calls $object->map() on each line of input. Classes that consume this role need only implement map() to produce a complete Hadoop Streaming compatible mapper.
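The read loop described above can be sketched roughly as follows. This is a stand-alone illustration, not the module's source: run_sketch and Demo::Mapper are hypothetical names, and because this section does not say how the key and value are derived from each line, the sketch assumes the chomped line is passed as the value with an undefined key. It reads from an in-memory filehandle so it can be run without piping anything to STDIN.

```perl
use strict;
use warnings;

# Hypothetical stand-in for the role's run loop (not the module's code).
# Reads lines from $in and hands each one to $mapper->map().
# Assumption: the chomped line is the value and the key is undefined.
sub run_sketch {
    my ($mapper, $in) = @_;
    while ( my $line = <$in> ) {
        chomp $line;
        $mapper->map( undef, $line );
    }
    return;
}

# Hypothetical mapper that records the words it sees instead of emitting.
package Demo::Mapper {
    sub new { return bless { seen => [] }, shift }

    sub map {
        my ($self, $key, $value) = @_;
        push @{ $self->{seen} }, split /\s+/, $value;
        return;
    }
}

# Feed two lines from an in-memory filehandle in place of STDIN.
open my $in, '<', \"hello world\nhello hadoop\n" or die "open: $!";
my $mapper = Demo::Mapper->new;
run_sketch( $mapper, $in );
print join( ',', @{ $mapper->{seen} } ), "\n";
```

In a real job the class would consume the role and call Package->run, which takes care of creating the instance and reading STDIN.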
$object->emit( $key, $value )
This method emits a key/value pair in the format expected by Hadoop Streaming. It does this by calling $self->put(). Any error raised by put() is caught and reported as a warning instead of ending the job.
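A minimal sketch of that error-to-warning behavior, written without the role so that it runs on its own. Demo::Emitter is a hypothetical class, its put() dies on an undefined key only to simulate a failure, and the return value of emit() here is an assumption made for illustration.

```perl
use strict;
use warnings;

package Demo::Emitter {
    sub new { return bless {}, shift }

    # Stand-in for put(): dies on an undefined key to simulate a failure.
    sub put {
        my ($self, $key, $value) = @_;
        die "undefined key\n" unless defined $key;
        print "$key\t$value\n";
        return 1;
    }

    # Delegates to put() and downgrades any exception to a warning.
    # Returning 1 or 0 is an assumption of this sketch.
    sub emit {
        my ($self, $key, $value) = @_;
        my $ok = eval { $self->put( $key, $value ); 1 };
        warn "emit failed: $@" unless $ok;
        return $ok ? 1 : 0;
    }
}

my $emitter = Demo::Emitter->new;
$emitter->emit( apple => 1 );   # writes the record to STDOUT
$emitter->emit( undef, 1 );     # warns on STDERR and keeps running
```

Because the failure is only a warning, one bad record does not stop the mapper from processing the rest of its input.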
$object->put( $key, $value )
This method writes a key/value pair to STDOUT in the format expected by Hadoop Streaming: the key, a tab, the value, and a newline ( key \t value \n ).
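To make the record layout concrete, here is a small sketch that builds the same tab-separated, newline-terminated lines. format_record is a hypothetical helper, not part of the module, and the sketch writes to an in-memory buffer where put() would write to STDOUT.

```perl
use strict;
use warnings;

# Hypothetical helper: build one "key \t value \n" record.
sub format_record {
    my ($key, $value) = @_;
    return join( "\t", $key, $value ) . "\n";
}

# Write to an in-memory handle here; put() itself writes to STDOUT.
my $buffer = '';
open my $out, '>', \$buffer or die "open: $!";
print {$out} format_record( hello => 1 );
print {$out} format_record( world => 2 );
close $out;

print $buffer;
```

Each record occupies exactly one line, with the first tab separating the key from the value.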
andrew grangaard <spazm@cpan.org>
Naoya Ito <naoya@hatena.ne.jp>
This software is copyright (c) 2010 by Naoya Ito <naoya@hatena.ne.jp>.
This is free software; you can redistribute it and/or modify it under the same terms as the Perl 5 programming language system itself.
To install Hadoop::Streaming::Mapper, copy and paste the appropriate command into your terminal.
cpanm
cpanm Hadoop::Streaming::Mapper
CPAN shell
perl -MCPAN -e shell
install Hadoop::Streaming::Mapper