NAME

File::Process::Utils - commonly used recipes for File::Process

SYNOPSIS

 use File::Process::Utils qw(process_csv);

 my $obj = process_csv('foo.csv', has_header => 1);

DESCRIPTION

Set of utilities that represent some common use cases for File::Process.

METHODS AND SUBROUTINES

process_csv

 process_csv(file, options)

Reads a CSV file using Text::CSV_XS and returns an array of hashes or an array of arrays.

Example:

 my $obj = process_csv(
   'foo.csv',
   has_header  => 1,
   csv_options => { sep_char => "\t" },
 );

file

Filename or file handle of an open CSV file.

options

List of options described below.

column_names

A list of column names that should be used as the CSV header. These names will be the keys of the hashes returned.

Note: By setting column_names to an empty array, you can force the return of an array of hashes instead of an array of arrays. The keys will be set to the strings col0..col{n-1}.

csv_options

Hash of options that will be passed through to Text::CSV_XS.

has_header

Boolean that indicates whether or not the first line of the CSV file should be considered the column titles. These will be used as the hash keys. If has_header is not true, then the first line is considered data and included in the returned array.

Set column_names to an array of strings that will be used as the keys in lieu of a header line. If you do not set column_names and has_header is not true, an array of arrays is returned instead of an array of hashes.
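
The interaction between has_header and column_names can be easier to see in code. The following is a plain-Perl sketch of the rules described above, not the module's implementation: shape_rows is a hypothetical helper, the rows are assumed to be already parsed, and has_header taking precedence over column_names is an assumption made only for this sketch.

```perl
use strict;
use warnings;

# Hypothetical helper, not part of File::Process::Utils. It picks the shape of
# the returned rows from the two options, following the rules in the text.
sub shape_rows {
  my ( $rows, %opts ) = @_;

  my @data = @{$rows};    # shallow copy so the caller's list is untouched
  my $names;

  if ( $opts{has_header} ) {
    $names = shift @data;    # first line supplies the keys (assumed precedence)
  }
  elsif ( $opts{column_names} ) {
    $names = $opts{column_names};
  }

  # no header and no column_names: array of arrays
  return [@data] if !$names;

  return [
    map {
      my $row = $_;

      # an empty name list falls back to generated keys col0..col{n-1}
      my @keys = @{$names} ? @{$names} : map { 'col' . $_ } 0 .. $#{$row};

      my %record;
      @record{@keys} = @{$row};
      \%record;
    } @data
  ];
}

my @rows = ( [ 'id', 'name' ], [ 1, 'alice' ], [ 2, 'bob' ] );

my $with_header = shape_rows( \@rows, has_header   => 1 );     # hashes keyed by id, name
my $generated   = shape_rows( \@rows, column_names => [] );    # hashes keyed by col0, col1
my $plain       = shape_rows( \@rows );                        # arrays, header line kept as data
```

In the sketch, only the has_header case drops the first line from the data; the other two cases keep it as an ordinary row.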

hooks

An array or hash of subroutines that will be passed each element of a row and should return a transformed value for that element.

If you pass a hash, each key should be one of the column names you passed in the column_names option or one of the generated keys (col{n}).

If you pass an array, it should contain a code reference at the index that corresponds to the index of the input column you wish to process.

  my %hooks = ( col1 => sub { uc shift } );
              
  my $obj = process_csv(
    'foo.csv',
    column_names => [],
    keep_open    => 1,
    csv_options  => { sep_char => "\t" },
    hooks        => \%hooks,
  );
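
To illustrate the difference between the hash and array forms of hooks, here is a plain-Perl sketch of how per-column hooks might be applied to a single row. apply_hooks is a hypothetical helper and not the module's code; leaving undef in array slots that have no hook is an assumption of this sketch.

```perl
use strict;
use warnings;

# Hypothetical helper, not part of File::Process::Utils: applies per-column
# hooks to one parsed row and returns the (modified) row.
sub apply_hooks {
  my ( $row, $hooks ) = @_;

  if ( ref($row) eq 'HASH' ) {
    # hash rows: hooks are looked up by column name
    for my $key ( keys %{$hooks} ) {
      next if !exists $row->{$key};
      $row->{$key} = $hooks->{$key}->( $row->{$key} );
    }
  }
  else {
    # array rows: hooks are looked up by column index
    for my $i ( 0 .. $#{$hooks} ) {
      next if ref( $hooks->[$i] ) ne 'CODE';    # skip slots without a hook
      $row->[$i] = $hooks->[$i]->( $row->[$i] );
    }
  }

  return $row;
}

# hash form: the hook is keyed by a generated column name
my $by_name = apply_hooks( { col0 => 'a', col1 => 'b' }, { col1 => sub { uc shift } } );

# array form: the hook sits at index 1, so it transforms the second column
my $by_index = apply_hooks( [ 'a', 'b' ], [ undef, sub { uc shift } ] );
```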

Instead of using hooks, which operate at the column level, you can define your own custom process() method and pass it as an option to process_csv(); all options are passed through to process_file().

  my $obj = process_csv(
    'foo.csv',
    column_names => [],
    keep_open    => 1,
    csv_options  => { sep_char => "\t" },
    process      => sub {
      my ( $fh, $lines, $args, $row ) = @_;
      $row->{col1} = uc $row->{col1};
      return $row;
    }
  );

keep_open

Boolean that indicates that the file should not be closed after all records are read.

max_rows

Maximum number of rows to process. If undefined, then all lines of the file will be processed.

skip_list

If column names are being used, this is a hash whose keys will be deleted from each returned hash.

If column names are not being used, skip_list is an array of indexes that will be removed from the returned arrays.

 process_csv(
   'foo.csv',
   has_header => 1,
   skip_list   => { ssn => 1 }
 );
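
The two forms of skip_list can be sketched in plain Perl as follows. This only illustrates the behavior described above and is not the module's implementation; apply_skip_list is a hypothetical helper operating on a single parsed row.

```perl
use strict;
use warnings;

# Hypothetical helper, not part of File::Process::Utils: removes the listed
# columns from one row.
sub apply_skip_list {
  my ( $row, $skip ) = @_;

  if ( ref($row) eq 'HASH' ) {
    # hash rows: skip_list is a hash whose keys name the columns to delete
    delete @{$row}{ keys %{$skip} };
    return $row;
  }

  # array rows: skip_list is an array of indexes to remove
  my %drop = map { $_ => 1 } @{$skip};

  return [ map { $row->[$_] } grep { !$drop{$_} } 0 .. $#{$row} ];
}

my $hash_row  = apply_skip_list( { name => 'alice', ssn => '123-45-6789' }, { ssn => 1 } );
my $array_row = apply_skip_list( [ 'alice', '123-45-6789', 'nyc' ], [1] );
```

In the array case the remaining elements shift left, so later indexes in the result no longer match their positions in the input line.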

SEE ALSO

File::Process, Text::ASCIITable::EasyTable

AUTHOR

Rob Lauer - <rlauer6@comcast.net>