ETL::Pipeline::Output::Memory - Store records in memory
    # Save the records into a giant list.
    use ETL::Pipeline;
    ETL::Pipeline->new( {
        input   => ['UnitTest'],
        mapping => {First => 'Header1', Second => 'Header2'},
        output  => ['Memory']
    } )->process;

    # Save the records into a hash, keyed by an identifier.
    use ETL::Pipeline;
    ETL::Pipeline->new( {
        input   => ['UnitTest'],
        mapping => {First => 'Header1', Second => 'Header2'},
        output  => ['Memory', key => 'First']
    } )->process;
ETL::Pipeline::Output::Memory writes records into a Perl data structure, in memory. The records can be accessed later in the same script. This output destination is useful when processing multiple input files.
ETL::Pipeline::Output::Memory offers two ways of storing the records: in a hash or in a list. ETL::Pipeline::Output::Memory always puts records into the list. If the "key" attribute is set, then ETL::Pipeline::Output::Memory also saves records into the hash.
The hash can be used for faster look-up. Use "key" when the record contains an identifier.
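The two storage structures described above can be modeled in plain Perl. This is a minimal sketch, not the module's actual code: the variables @list, %hash, $key, and @input are hypothetical stand-ins for the "list", "hash", and "key" attributes and for records arriving from an input source.

```perl
use strict;
use warnings;

# Hypothetical stand-ins for the "list" and "hash" attributes.
my @list;
my %hash;
my $key = 'First';    # plays the role of the "key" attribute

# Made-up records, in the order an input source might deliver them.
my @input = (
    { First => 'A1', Second => 'x' },
    { First => 'B2', Second => 'y' },
    { First => 'A1', Second => 'z' },
);

for my $record (@input) {
    # Every record goes into the list, in input order.
    push @list, $record;

    # With a key, the same reference is also filed under its identifier.
    push @{ $hash{ $record->{$key} } }, $record if defined $key;
}

print "list order: ", join( ' ', map { $_->{Second} } @list ), "\n";
print "hash keys:  ", join( ' ', sort keys %hash ), "\n";
```

Each hash value is an array reference because several records may share one identifier, as 'A1' does here.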
key is optional. If you want to store the records in a hash, then this is the field name whose value becomes the key. When set, records go into "hash" as well as "list".
If you don't specify a key, then records are stored only in the array, "list".
hash is a hash reference, populated only when "key" is set. Each hash key is the value of the field identified by "key". Each hash value is an array reference holding all of the records with that same key.
list is an array reference that stores records. The records are saved in the same order as they are read from the input source. Each list element is a hash reference (the record).
list always has a complete set of records, whether "key" is set or not.
close doesn't do anything. There's nothing to close or shut down.
number_of_ids returns the count of unique identifiers. This may not be the same as the number of records, because one key may have multiple records.
number_of_ids only works if the "key" attribute was set.
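The difference between the two counts can be seen on a small hash of arrays. This is a sketch under assumptions: %hash is a hand-built, hypothetical copy of what the "hash" attribute might hold, and the counting below only illustrates the idea rather than reproducing the module's implementation.

```perl
use strict;
use warnings;

# Hypothetical contents of "hash" when key => 'First'.
my %hash = (
    A1 => [ { First => 'A1', Second => 'x' }, { First => 'A1', Second => 'z' } ],
    B2 => [ { First => 'B2', Second => 'y' } ],
);

# Unique identifiers: one per hash key.
my $number_of_ids = scalar keys %hash;

# Records: the sum of the array lengths under every key.
my $number_of_records = 0;
$number_of_records += scalar @$_ for values %hash;

print "$number_of_ids ids, $number_of_records records\n";
```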
number_of_records returns the count of records currently in storage.
open doesn't do anything. There's nothing to open or set up.
records returns a list of all the records currently in storage. The list contains hash references, one for each record.
with_id returns the records for a given key. Pass in a value for the key and with_id returns an array reference of the records stored under it.
with_id only works if the "key" attribute was set.
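The reason a keyed lookup is faster can be sketched by comparing it with a scan of the full list. The data and variable names here are hypothetical, and returning an empty array reference for an unknown key is an assumption of this sketch, not documented behavior of with_id.

```perl
use strict;
use warnings;

# Made-up records, stored in input order.
my @list = (
    { First => 'A1', Second => 'x' },
    { First => 'B2', Second => 'y' },
    { First => 'A1', Second => 'z' },
);

# Index the same references by the 'First' field.
my %hash;
push @{ $hash{ $_->{First} } }, $_ for @list;

# Keyed lookup: a single hash access, regardless of list size.
my $by_hash = $hash{'A1'} // [];

# Without the hash, every record has to be inspected.
my @by_scan = grep { $_->{First} eq 'A1' } @list;

print join( ',', map { $_->{Second} } @$by_hash ), "\n";
print join( ',', map { $_->{Second} } @by_scan ), "\n";
```

Both approaches find the same records; the hash simply avoids walking the whole list on every lookup.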
write saves the current record into memory. Your script can access the records after calling "process" in ETL::Pipeline like this: $etl->output->records. Both "records" and "with_id" can be used.
If "key" is set, write saves the record in both "hash" and "list". We're storing a reference, not a copy, so there's very little cost. And it allows methods such as "number_of_records" to work.
WARNING: This method stores a reference to the original record. If the input source re-uses the hash or embedded references, it will update all of the currently stored values too. ETL::Pipeline::Output::Memory does not make a copy.
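The hazard in this warning is easy to reproduce with core Perl. The sketch below assumes a hypothetical input source that re-uses a single hash, %buffer, for every record; it shows what happens when references are stored, and how a shallow copy would behave instead.

```perl
use strict;
use warnings;

# Hypothetical input source that re-uses one hash for every record.
my %buffer = ( First => 'A1' );

# Storing references: both entries point at the same hash.
my @list;
push @list, \%buffer;
$buffer{First} = 'B2';    # the source loads its next record
push @list, \%buffer;
my @aliased = map { $_->{First} } @list;

# Storing shallow copies: each entry keeps its own values.
my @copies;
$buffer{First} = 'A1';
push @copies, {%buffer};
$buffer{First} = 'B2';
push @copies, {%buffer};
my @copied = map { $_->{First} } @copies;

print "aliased: @aliased\n";    # aliased: B2 B2
print "copied: @copied\n";      # copied: A1 B2
```

A shallow copy such as {%buffer} only protects top-level values. Records with embedded references would need a deep copy, for example dclone from the core Storable module.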
ETL::Pipeline, ETL::Pipeline::Output
Robert Wohlfarth <robert.j.wohlfarth@vumc.org>
Copyright 2021 (c) Vanderbilt University
This program is free software; you can redistribute it and/or modify it under the same terms as Perl itself.