NAME

Data::Tubes::Plugin::Reader

DESCRIPTION

This module contains factory functions to generate tubes that ease reading of input records.

Each of the generated tubes has the following contract:

  • the input record MUST be a hash reference;

  • if the factory argument input is defined and not empty, the sub-hash it names MUST contain a field fh with a filehandle; otherwise, the record itself MUST contain fh. By default, input is set to source;

  • one field in the hash (according to factory argument output, set to raw by default) is set to the output of the reading operation.
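
The contract above can be pictured with plain Perl data structures. The following is only a sketch using core Perl: read_one_line is a hypothetical helper, not part of Data::Tubes, and the shallow copy of the input record is an assumption about how the output record is built.

```perl
use strict;
use warnings;

# In-memory filehandle standing in for a real file (core Perl feature).
my $text = "first\nsecond\n";
open my $fh, '<', \$text or die "open(): $!";

# Input record shaped for the default input => 'source'.
my $input_record = {source => {fh => $fh}};

# Hypothetical helper, NOT part of Data::Tubes: reads one line from the
# filehandle in the record and returns a new record with the default
# output field 'raw' filled in (assumed: shallow copy of the input).
sub read_one_line {
   my ($record) = @_;
   my $line = readline $record->{source}{fh};
   return unless defined $line;
   chomp $line;
   return {%$record, raw => $line};
}

my $first  = read_one_line($input_record);
my $second = read_one_line($input_record);
print "$first->{raw}, $second->{raw}\n";
```

Each output record still carries the source sub-hash, so later steps can reach the same filehandle.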

The factory functions below have two names, one starting with read_ and the other without this prefix. The two are equivalent; the short version can be handier, e.g. when using tube or pipeline from Data::Tubes.

FUNCTIONS

by_line

This is a simple wrapper around "by_separator", where the separator argument is forced to be a newline \n.

by_paragraph

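This is a simple wrapper around "by_separator", where the separator argument is forced to be the empty string. As an INPUT_RECORD_SEPARATOR (see perlvar), the empty string selects paragraph mode, where records are separated by one or more blank lines.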
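
The difference between the two separators can be seen with core Perl alone. This is a sketch of the underlying INPUT_RECORD_SEPARATOR behaviour, not of the module's code; split_records is a hypothetical helper introduced for illustration.

```perl
use strict;
use warnings;

# Hypothetical helper (not in Data::Tubes): read all records from a string
# using the given input record separator, chomping each one.
sub split_records {
   my ($text, $separator) = @_;
   open my $fh, '<', \$text or die "open(): $!";
   local $/ = $separator;    # affects both readline and chomp below
   my @records = <$fh>;
   chomp @records;
   return @records;
}

my $text = "a1\na2\n\n\nb1\nb2\n";

my @lines      = split_records($text, "\n");    # like by_line
my @paragraphs = split_records($text, '');      # like by_paragraph

printf "%d lines, %d paragraphs\n", scalar(@lines), scalar(@paragraphs);
```

With a newline separator the blank lines come through as empty records, while in paragraph mode runs of blank lines only delimit records.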

by_record_reader

   my $tube = by_record_reader($record_reader, %args); # OR
   my $tube = by_record_reader(%args); # OR
   my $tube = by_record_reader(\%args);

Read inputs according to a record reader subroutine.

Accepted arguments are:

emit_eof

when an end-of-file is hit, emit a record with the output field set to undef, so that this condition is visible to downstream tubes;

identification

you don't normally need to use this... so look at the code in case you have to;

input

name of the input field in the record. If defined and not empty, it points to a sub-hash that will contain a filehandle field fh; otherwise, this fh field MUST be contained directly in the input record contents. Defaults to source;

name

name of the tube, for easier debugging;

output

name of the output field. The output record is ALWAYS a hash reference, containing the input record and the output corresponding to this key. Defaults to raw;

record_reader

a sub reference that takes a filehandle as the only input parameter, and returns whatever is read. This is the main parameter, so it can also be provided as the first unnamed argument when calling this factory function. It has no default and is required.
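
As an illustration, a record reader for fixed-size chunks might look like the sketch below. It uses core Perl only and exercises the sub by hand rather than through the tube. The 4-byte record size is arbitrary, and returning undef at end-of-file is an assumption suggested by the emit_eof description, not something stated here.

```perl
use strict;
use warnings;

# Hypothetical record reader: takes a filehandle as its only parameter and
# returns the next chunk of up to 4 bytes, or undef when nothing is left
# (assumed end-of-file convention).
my $record_reader = sub {
   my ($fh) = @_;
   my $n = read($fh, my $buffer, 4);
   die "read(): $!" unless defined $n;
   return $n ? $buffer : undef;
};

# In-memory filehandle standing in for a real input file.
my $data = 'AAAABBBBCC';
open my $fh, '<', \$data or die "open(): $!";

# Call the reader directly until it signals end-of-file.
my @chunks;
while (defined(my $chunk = $record_reader->($fh))) {
   push @chunks, $chunk;
}
print join('|', @chunks), "\n";
```

A sub like this is what would be passed as the first unnamed argument, as in by_record_reader($record_reader, %args).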

by_separator

   my $tube = by_separator($separator, %args); # OR
   my $tube = by_separator(%args); # OR
   my $tube = by_separator(\%args);

Read inputs using a separator string (à la INPUT_RECORD_SEPARATOR, see perlvar).

Accepted arguments are:

chomp

apply the chomp function before emitting what's read. Defaults to a true value;

emit_eof

when an end-of-file is hit, emit a record with the output field set to undef, so that this condition is visible to downstream tubes. Defaults to a false value;

identification

you don't normally need to use this... so look at the code in case you have to;

input

name of the input field in the record. If defined and not empty, it points to a sub-hash that will contain a filehandle field fh; otherwise, this fh field MUST be contained directly in the input record contents. Defaults to source;

name

name of the tube, for easier debugging;

output

name of the output field. The output record is ALWAYS a hash reference, containing the input record and the output corresponding to this key. Defaults to raw;

separator

a separator string to set as INPUT_RECORD_SEPARATOR, see perlvar. This parameter defaults to undef. It is the main parameter, so it can also be provided as the first unnamed argument when calling this factory function.
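
The interplay of separator and chomp follows Perl's own rules for INPUT_RECORD_SEPARATOR, which can be sketched with core Perl. The helper read_all is hypothetical and only mimics the two arguments; it is not the module's implementation.

```perl
use strict;
use warnings;

# Hypothetical helper (not in Data::Tubes): read every record from a string
# with the given separator, optionally chomping each record.
sub read_all {
   my ($text, $separator, $chomp) = @_;
   open my $fh, '<', \$text or die "open(): $!";
   local $/ = $separator;    # undef means "read everything at once"
   my @records = <$fh>;
   chomp @records if $chomp;
   return @records;
}

my $text = 'alpha;beta;gamma';

my @chomped = read_all($text, ';',   1);    # separator stripped
my @kept    = read_all($text, ';',   0);    # separator left in place
my @slurped = read_all($text, undef, 1);    # one record with all the input

print join(',', @chomped), "\n";
print join(',', @kept),    "\n";
print scalar(@slurped), " record when the separator is undef\n";
```

Note that with the default undef separator the whole input arrives as a single record, so a separator is normally worth setting explicitly.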

read_by_line

Alias for "by_line".

read_by_paragraph

Alias for "by_paragraph".

read_by_record_reader

Alias for "by_record_reader".

read_by_separator

Alias for "by_separator".

BUGS AND LIMITATIONS

Report bugs either through RT or GitHub (patches welcome).

AUTHOR

Flavio Poletti <polettix@cpan.org>

COPYRIGHT AND LICENSE

Copyright (C) 2016 by Flavio Poletti <polettix@cpan.org>

This module is free software. You can redistribute it and/or modify it under the terms of the Artistic License 2.0.

This program is distributed in the hope that it will be useful, but without any warranty; without even the implied warranty of merchantability or fitness for a particular purpose.