Data::Tubes::Plugin::Writer
This module contains functions to ease using tubes.
Functions starting with write_ have an equivalent form without this prefix.
dispatch_to_files
my $tube = dispatch_to_files($filename, %args);
# OR
my $tube = dispatch_to_files(%args);
# OR
my $tube = dispatch_to_files(\%args);
A composition of dispatch in Data::Tubes::Plugin::Plumbing and "to_files" that handles multiple output channels, selected on the basis of the contents of the input record. This is the most flexible mechanism available to relate the output channel to the input record, while at the same time taking advantage of the automatic segmentation of output into multiple files (as provided by "to_files").
Accepts the same arguments as "to_files", although it will always override parameter filename (for obvious reasons!). This parameter can be set to either a sub reference that is supposed to generate a file name or a handle each time it is invoked (as filename_factory) or a string holding a template filename (as filename_template), so it is a handy shortcut for both. For this reason, it is also the default parameter when passed as the first, unnamed option.
The function also accepts all options from dispatch in Data::Tubes::Plugin::Plumbing, plus the following ones:
filename
a handy shortcut for either filename_factory or filename_template, so it is NOT passed on directly to "to_files";
filename_factory
a sub reference that will emit anything valid for filename in "to_files". It is called with the key and the record, see dispatch in Data::Tubes::Plugin::Plumbing for details;
filename_template
a meta-template string, i.e. a Template::Perlish template that will be expanded based on a hash with the following keys:
key
whatever is passed by dispatch in Data::Tubes::Plugin::Plumbing;
record
the current record.
This field is used only if a filename_factory is not available.
The expansion should return anything valid for "to_files".
As an example, suppose you want to generate your filenames based on the key passed by dispatch, and on one additional field foo in the first record for that key. You might have a filename_template like the following:
$template = 'output-[% key %]-[% record.foo %].%03n.txt';
After the expansion, you can get the following templates:
output-bar-whatever.%03n.txt
output-baz-yuppie.%03n.txt
...
i.e. templates that can be further expanded according to a policy.
tp_opts
options for Template::Perlish, e.g. if you want to change the delimiters.
An example is due at this point:
my $tube = dispatch_to_files(
   # options for `dispatch_to_files` directly
   filename_template => 'output-[% key %]-%02n.txt',
   # options for `Data::Tubes::Plugin::Plumbing::dispatch`. This is
   # used to automatically generate the "key" from the input record,
   # i.e. the key will be $record->{structured}{class}
   key => [qw< structured class >],
   # options for `to_files`
   policy => { records_threshold => 10 },
   header => '{{{',
   footer => '}}}',
);
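The two expansion stages described for filename_template can be sketched in plain Perl. This is only an illustration: expand_meta is a hypothetical stand-in for Template::Perlish (it understands just `[% key %]` and `[% record.FIELD %]`), the `%03n` sequence assumes the index expansion documented for "to_files", and nothing here is part of Data::Tubes.

```perl
use strict;
use warnings;

# Hypothetical stand-in for the Template::Perlish stage: it only
# understands [% key %] and [% record.FIELD %], nothing else.
sub expand_meta {
   my ($meta, $key, $record) = @_;
   (my $template = $meta) =~ s{\[%\s*(key|record\.(\w+))\s*%\]}
                              {defined $2 ? $record->{$2} : $key}gex;
   return $template;
}

my $meta     = 'output-[% key %]-[% record.foo %].%03n.txt';
my $template = expand_meta($meta, 'bar', {foo => 'whatever'});

# The result is still a template: %03n is left for the second stage.
print "$template\n";
```

The first stage only fills in what depends on the key and the record; the index sequence survives untouched, so the policy-driven segmentation can expand it later.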
to_files
my $tube = to_files($filename, %args);
# OR
my $tube = to_files(%args);
# OR
my $tube = to_files(\%args);
generate a tube for writing to files.
In this context, file is something quite broad, ranging from one single file, to filehandles, to families of files that share a common way to derive their filename.
This factory uses Data::Tubes::Util::Output, so you might want to take a look there too.
The central argument is filename, which can also be set as an initial unnamed parameter in the arguments list. You can set it in different ways:
filehandle
and this will be used. No operations will be performed on it apart from printing (so no binmode, no close, etc.);
filename
i.e. a string with the name of a file or a reference to a string (anything suitable for CORE::open);
filename template
i.e. a template that is ready for expansion (via sprintffy in Data::Tubes::Util). This is useful if your output should be segmented into multiple files based on a policy (another argument to the factory). The name can contain sprintf-like sequences (most notably, %n represents the increasing id of the file, and %02n is the same, but printed in at least two characters and zero-padded);
sub reference
that is supposed to return either a filehandle or a filename at each call. This is how you can gain maximum flexibility at the expense of more coding on your side.
Most of the time you'll probably be interested in the filename template, so here's an example:
$template = 'my-output-%02n.txt';
expands to
my-output-00.txt
my-output-01.txt
...
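To make the index expansion concrete, here is a minimal sketch of how a `%n`-style sequence could be substituted. expand_index is a hypothetical helper written for this illustration; the real work is done by sprintffy in Data::Tubes::Util, whose implementation may well differ.

```perl
use strict;
use warnings;

# Hypothetical helper: replaces %n, %02n, ... with the file index
# and %% with a literal percent sign; other sequences are untouched.
sub expand_index {
   my ($template, $index) = @_;
   (my $name = $template) =~ s{%(?:(\d*)n|(%))}
                              {defined $2 ? '%' : sprintf("%${1}d", $index)}gex;
   return $name;
}

my @names = map { expand_index('my-output-%02n.txt', $_) } 0 .. 2;
print "$_\n" for @names;
```

The optional digits are simply handed over to sprintf as an integer format, which is what gives the zero-padded, fixed-width index.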
The following expansions are available:
%(\d*)n
expands to the current index for a file, always increasing and starting from 0. The optional digits are handled like an integer expansion in CORE::sprintf;
%Y
expands to the year (four digits);
%m
expands to the month (two digits, zero-padded on the left, starting from 1);
%d
expands to the day (two digits, zero-padded on the left, starting from 1);
%H
expands to the hour (two digits, zero-padded on the left, starting from 0);
%M
expands to the minute (two digits, zero-padded on the left, starting from 0);
%S
expands to the second (two digits, zero-padded on the left, starting from 0);
%z
expands to the time zone (in the format [-+]\d\d:\d\d);
%D
expands to the date without separators, same as %Y%m%d;
%T
expands to the time without separators and including the time zone, same as %H%M%S%z;
%t
expands to a full timestamp without separators and including the time zone, same as %Y%m%d%H%M%S%z;
%%
expands to a literal percent sign, in case you were wondering.
NOTE: if you want to include a timestamp, use %t instead of combining %D and %T. The latter two expansions rely on two different calls to CORE::localtime, so there is a very slight chance of tripping over the day change and getting the date of the previous day with the time of the next one, i.e. a timestamp that is off by one day. Using %t takes all the values from one single call, so it always provides a consistent read.
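The difference between one reading and two can be illustrated with strftime from the core POSIX module. This is a sketch of the principle only, not of how Data::Tubes implements %t; the time zone is left out to keep the output portable across platforms.

```perl
use strict;
use warnings;
use POSIX qw< strftime >;

# Consistent: date and time are derived from the same reading.
my @now   = localtime;
my $stamp = strftime('%Y%m%d', @now) . strftime('%H%M%S', @now);

# Racy: two separate readings of the clock, which may straddle
# midnight and pair one day's date with another day's time.
my $racy = strftime('%Y%m%d', localtime) . strftime('%H%M%S', localtime);

print "$stamp\n$racy\n";
```

Both strings will almost always be identical; the second form is wrong only in the rare case where midnight falls between the two calls.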
If you provide a string filename that has no expansion, but at the same time set a policy that leads to generating multiple files, the first file will be named exactly as specified in filename, and each following one will have an underscore character and a number (starting from 1, without padding) appended to that name. So, the following filename:
$template = 'my-output.txt';
expands to:
my-output.txt
my-output.txt_1
my-output.txt_2
...
If you don't set a policy, or your thresholds are not hit, then only the first filename will be used of course.
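The naming rule for a filename without expansions can be sketched as follows. fallback_name is a hypothetical helper that only mimics the behaviour described above; it is not the code used by Data::Tubes.

```perl
use strict;
use warnings;

# Hypothetical helper mimicking the rule for a filename without
# expansions: the first file keeps the name as is, later ones get
# "_<n>" appended, with n starting from 1 and no padding.
sub fallback_name {
   my ($filename, $index) = @_;
   return $index == 0 ? $filename : "${filename}_$index";
}

my @names = map { fallback_name('my-output.txt', $_) } 0 .. 2;
print "$_\n" for @names;
```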
The following arguments are accepted:
binmode
value to set via CORE::binmode on opened filehandles (not on provided ones though). See Data::Tubes::Util::Output;
filename
see above. Defaults to standard output;
footer
data to be inserted as footer when closing/releasing a file, eventually passed to Data::Tubes::Util::Output. This parameter is initially passed to "read_file_maybe" in Data::Tubes::Util, so it can either be a string (as required by Data::Tubes::Util::Output), or an array reference that is expanded into a list passed to "read_file" in Data::Tubes::Util for reading the text;
header
data to be inserted as header when opening/starting to use a file, eventually passed to Data::Tubes::Util::Output. This parameter is initially passed to "read_file_maybe" in Data::Tubes::Util, so it can either be a string (as required by Data::Tubes::Util::Output), or an array reference that is expanded into a list passed to "read_file" in Data::Tubes::Util for reading the text;
input
input field in the record. This is what will actually be printed. Defaults to rendered, in compliance with the output of tubes from Data::Tubes::Plugin::Renderer.
interlude
data to be inserted between records printed out, eventually passed to Data::Tubes::Util::Output. This parameter is initially passed to "read_file_maybe" in Data::Tubes::Util, so it can either be a string (as required by Data::Tubes::Util::Output), or an array reference that is expanded into a list passed to "read_file" in Data::Tubes::Util for reading the text;
name
name of the tube, useful when debugging;
policy
a policy object where you can set thresholds for limiting the content/size of generated files. See Data::Tubes::Util::Output.
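How header, interlude and footer frame the records can be sketched with an in-memory filehandle. This is a hand-rolled illustration that assumes each record carries its text in the rendered field and that a newline follows each record; it does not use Data::Tubes::Util::Output, whose actual behaviour (for instance around newlines) may differ.

```perl
use strict;
use warnings;

my @records = map { +{rendered => $_} } qw< first second third >;
my %args = (header => "{{{\n", interlude => "---\n", footer => "}}}\n");

# A reference to a string can be opened as an in-memory file.
open my $fh, '>', \my $buffer or die "open(): $!";
print {$fh} $args{header};
my $count = 0;
for my $record (@records) {
   print {$fh} $args{interlude} if $count++;    # only between records
   print {$fh} "$record->{rendered}\n";
}
print {$fh} $args{footer};
close $fh or die "close(): $!";

print $buffer;
```

The header and footer appear once per file, while the interlude appears only between consecutive records, never before the first or after the last.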
write_to_files
Alias for "to_files".
Report bugs either through RT or GitHub (patches welcome).
Flavio Poletti <polettix@cpan.org>
Copyright (C) 2016 by Flavio Poletti <polettix@cpan.org>
This module is free software. You can redistribute it and/or modify it under the terms of the Artistic License 2.0.
This program is distributed in the hope that it will be useful, but without any warranty; without even the implied warranty of merchantability or fitness for a particular purpose.