NAME

ETL::Pipeline::Output::Memory - Save records in memory

SYNOPSIS

  # Save the records into a giant list.
  use ETL::Pipeline;
  ETL::Pipeline->new( {
    input   => ['UnitTest'],
    mapping => {First => 'Header1', Second => 'Header2'},
    output  => ['Memory']
  } )->process;

  # Save the records into a hash, keyed by an identifier.
  use ETL::Pipeline;
  ETL::Pipeline->new( {
    input   => ['UnitTest'],
    mapping => {First => 'Header1', Second => 'Header2'},
    output  => ['Memory', key => 'First']
  } )->process;

DESCRIPTION

ETL::Pipeline::Output::Memory writes records into a Perl data structure, in memory. The records can be accessed later in the same script.

This output destination is useful when processing multiple input files.

Internal data structure

ETL::Pipeline::Output::Memory offers two ways of storing the records - in a hash or in a list. ETL::Pipeline::Output::Memory automatically chooses the correct one depending on the "key" attribute.

When "key" is set, ETL::Pipeline::Output::Memory saves the records in a hash, keyed by the given field. This allows for faster look-up. Use "key" when the record has an identifier.

When "key" is not set, ETL::Pipeline::Output::Memory saves the records in a list. The list is not sorted; records stay in the order they were read from the input source (first in, first out).
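
The two layouts can be pictured with plain Perl data structures. The following is a minimal sketch, not the module's actual code: the store_record helper and the sample records are hypothetical, and the keyed layout assumes one array reference of records per key value, as described under "hash".

```perl
use strict;
use warnings;

# Hypothetical helper that mimics the storage choice described above.
# $store is { hash => {}, list => [] }; $key may be undef.
sub store_record {
    my ($store, $record, $key) = @_;
    if (defined $key) {
        # Keyed: group a copy of the record under its identifier value.
        push @{ $store->{hash}{ $record->{$key} } }, { %{$record} };
    }
    else {
        # Not keyed: append a copy of the record in arrival order.
        push @{ $store->{list} }, { %{$record} };
    }
    return;
}

# Made-up sample records using the field names from the SYNOPSIS.
my @input = (
    { First => 'A', Second => 1 },
    { First => 'B', Second => 2 },
    { First => 'A', Second => 3 },
);

my $keyed   = { hash => {}, list => [] };
my $unkeyed = { hash => {}, list => [] };
for my $record (@input) {
    store_record($keyed,   $record, 'First');
    store_record($unkeyed, $record, undef);
}

# $keyed->{hash} now maps 'A' to two records and 'B' to one.
# $unkeyed->{list} holds all three records in input order.
```

Grouping records under an array reference means that two records sharing an identifier do not overwrite each other.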

METHODS & ATTRIBUTES

Arguments for "output" in ETL::Pipeline

key

The "current" record is stored as a Perl hash. key is the output destination field name that ties this record with whatever other data you have. In short, key is the identifier field.

key can be blank. In that case, records are stored in "list", unsorted.

Called from "process" in ETL::Pipeline

write_record

write_record copies the contents of "current" into "hash" or "list", saving the record into memory. You can retrieve the records later using the "with_id" or "records" methods.

configure

configure doesn't actually do anything. But it is required by "process" in ETL::Pipeline.

finish

finish doesn't actually do anything. But it is required by "process" in ETL::Pipeline.

Other methods and attributes

with_id

with_id returns the records for the given key. Pass in a value for the key and with_id returns an array reference of the records that have that value.

with_id only works if the "key" attribute was set.
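
To illustrate the return shape, here is a small sketch of a keyed lookup over a hash of array references. The lookup_by_id function and the sample data are hypothetical stand-ins, not the module's code. Returning an empty array reference for an unknown identifier is an assumption made for this sketch; the documentation does not say what with_id does in that case.

```perl
use strict;
use warnings;

# Hypothetical stand-in for a keyed store:
# identifier value => array reference of records.
my %by_id = (
    A => [ { First => 'A', Second => 1 }, { First => 'A', Second => 3 } ],
    B => [ { First => 'B', Second => 2 } ],
);

# Return the array reference of records for one identifier value.
# Assumption for this sketch: unknown identifiers yield an empty array ref.
sub lookup_by_id {
    my ($store, $id) = @_;
    return exists $store->{$id} ? $store->{$id} : [];
}

my $matches = lookup_by_id(\%by_id, 'A');
print scalar(@{$matches}), " record(s) for A\n";
print "Second = $_->{Second}\n" for @{$matches};
```

Because the result is always an array reference in this sketch, the caller can loop over it without first checking whether the identifier exists.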

hash

hash is a hash reference used when "key" is set. The key is the value of the field identified by "key". The value is an array reference. The array contains all of the records with that same key.

records

records returns a list of hash references. Each hash reference is one data record. records only works when the "key" attribute is blank.

list

list is an array reference that stores records. The records are saved in the same order as they are read from the input source. Each list element is a hash reference (the record).

default_fields

Initialize "current" for the next record.

SEE ALSO

ETL::Pipeline, ETL::Pipeline::Output, ETL::Pipeline::Output::Storage::Hash

AUTHOR

Robert Wohlfarth <robert.j.wohlfarth@vanderbilt.edu>

LICENSE

Copyright 2016 (c) Vanderbilt University

This program is free software; you can redistribute it and/or modify it under the same terms as Perl itself.