NAME

Text::AutoCSV - helper module to automate the use of Text::CSV

VERSION

version 1.0.3

SYNOPSIS

By default, Text::AutoCSV will detect the following characteristics of the input:

- The separator, among ",", ";" and "\t" (tab)

- The escape character, among '"' (double-quote) and '\\' (backslash)

- Try UTF-8 and if it fails, fall back on latin1

- Read the header line and compute field names

- If asked to (see "fields_dates_auto"), detect any field that contains a datetime value, trying 20 date formats, possibly followed by a time (6 time formats tested)

- If asked to (see "fields_dates"), detect datetime format of certain fields, croak if no datetime format can be worked out

- Fields identified as containing a datetime value ("fields_dates_auto" or "fields_dates") are stored as DateTime objects by default

Text::AutoCSV also provides methods to search on fields (using cached hash tables) and it can populate the value of "remote" fields, made from joining 2 CSV files with a key-value search

        use Text::AutoCSV;

        Text::AutoCSV->new()->write();    # Read CSV data from std input, write to std output

        Text::AutoCSV->new(in_file => 'f.csv')->write(); # Read CSV data from f.csv, write to std output

                # Read CSV data from f.csv, write to g.csv
        Text::AutoCSV->new(in_file => 'f.csv', out_file => 'g.csv')->write();

        my $csv = Text::AutoCSV->new(in_file => 'zips.csv');
                # Get zipcode of Claix
        my $z = $csv->vlookup('CITY', ' claix    ', 'ZIPCODE');
                # Same as above, but vlookup is strict for case and spaces around
        my $csv = Text::AutoCSV->new(in_file => 'zips.csv', search_case => 1, search_trim => 0);
        my $z = $csv->vlookup('CITY', 'Claix', 'ZIPCODE');

                # Identify column internal names with more flexibility as the default mechanism
        my $csv = Text::AutoCSV->new(in_file => 'zips.csv',
                fields_hr => {'CITY' => '^(city|town)', 'ZIPCODE' => '^zip(code)?$'});
                # Get zipcode of Claix
        my $z = $csv->vlookup('CITY', ' claix    ', 'ZIPCODE');

                # Create field 'MYCITY' made by taking pers.csv' ZIP column value, looking it up in the
                # ZIPCODE columns of zips.csv, taking CITY colmun value and naming it 'MYCITY'. Output is
                # written in std output.
                # If a zipcode is ambiguous, say it.
        Text::AutoCSV->new(in_file => 'pers.csv')
                ->field_add_link('MYCITY', 'ZIP->ZIPCODE->CITY', 'zips.csv',
                        { ignore_ambiguous => 0, value_if_ambiguous => '<duplicate zipcode found!>' })->write();

                # Note the above can also be written using Text::AutoCSV level attributes:
        Text::AutoCSV->new(in_file => 'pers.csv',
                search_ignore_ambiguous => 0, search_value_if_ambiguous => '<duplicate zipcode found!>')
                ->field_add_link('MYCITY', 'ZIP->ZIPCODE->CITY', 'zips.csv')->write();

                # Create 'MYCITY' field as above, then display some statistics
        my $nom_compose = 0;
        my $zip_not_found = 0;
        Text::AutoCSV->new(in_file => 'pers.csv', walker_hr => \&walk)
                ->field_add_link('MYCITY', 'ZIP->ZIPCODE->CITY', 'zips.csv')->read();
        sub walk {
                my $hr = shift;
                $nom_compose++ if $hr->{'NAME'} =~ m/[- ]/;
                $zip_not_found++ unless defined($hr->{'MYCITY'});
        }
        print("Number of persons with a multi-part name: $nom_compose\n");
        print("Number of persons with unknown zipcode: $zip_not_found\n");

        Text::AutoCSV->new(in_file => 'names.csv', out_file => 'ucnames.csv',
                read_post_update_hr => \&updt)->write();
        sub updt { $_[0]->{'LASTNAME'} =~ s/^.*$/\U&/; }

        Text::AutoCSV->new(in_file => 'squares.csv', out_file => 'checkedsquares.csv',
                write_filter_hr => \&wf)->write();
        sub wf {
                return 0 if $_[0]->{'X'} ** 2 != $_[0]->{'SQUARE'};
                return 1;
        }

                # Add a field for the full name, made of the concatenation of the
                # first name and the last name.
                # Also display stats about empty full names.
        Text::AutoCSV->new(in_file => 'dirpeople.csv', out_file => 'dirwithfn.csv', verbose => 1)
                ->field_add_computed('FULLNAME', \&calc_fn)->write();
        sub calc_fn {
                my ($field, $hr, $stats) = @_;
                my $fn = $hr->{'FIRSTNAME'} . ' ' . uc($hr->{'LASTNAME'});
                $stats->{'empty full name'}++ if $fn eq ' ';
                return $fn;
        }

                # Detect any field containing a datetime value and convert it to yyyy-mm-dd whatever the
                # input format is.
        Text::AutoCSV->new(in_file => 'in.csv', out_file => 'out.csv', fields_dates_auto => 1,
                out_dates_format => '%F')->write();

                # Detect any field containing a datetime value and convert it to a US datetime whatever the
                # input format is.
        Text::AutoCSV->new(in_file => 'in.csv', out_file => 'out.csv', fields_dates_auto => 1,
                out_dates_format => '%b %d, %Y, %I:%M:%S %p', out_dates_locale => 'en')->write();

                # Find dates of specific formats and convert it into yyyy-mm-dd
        Text::AutoCSV->new(in_file => 'raw.csv', out_file => 'cooked.csv',
                dates_formats_to_try => ['%d_%m_%Y', '%m_%d_%Y', '%Y_%m_%d'],
                out_dates_format => '%F')-write();

                # Take the dates on columns 'LASTLOGIN' and 'CREATIONDATE' and convert it into French dates
                # (day/month/year).
                # Text::AutoCSV will croak if LASTLOGIN or CREATIONDATE do not contain a datetime format.
                # By default, Text::AutoCSV will try 20 different formats.
        Text::AutoCSV->new(in_file => 'in.csv', out_file => 'out.csv',
                fields_dates => ['LASTLOGIN', 'CREATIONDATE'], out_dates_format => '%d/%m/%Y')->write();

                # Convert 2 datetime fields into unix standard epoch
                # Write -1 if datetime is empty.
        sub toepoch { return $_->epoch() if $_; -1; }
        Text::AutoCSV->new(in_file => 'stats.csv', out_file => 'stats-epoch.csv',
                fields_dates => ['ATIME', 'MTIME'])
                ->in_map('ATIME', \&toepoch)
                ->in_map('MTIME', \&toepoch)
                ->write();

                # Do the other way round from above: convert 2 fields containing unix standard epoch into a
                # string displaying a human-readable datetime.
        my $formatter = DateTime::Format::Strptime->new(pattern => 'DATE=%F, TIME=%T');
        sub fromepoch {
                return $formatter->format_datetime(DateTime->from_epoch(epoch => $_)) if $_ >= 0;
                '';
        }
        $csv = Text::AutoCSV->new(in_file => 'stats-epoch.csv', out_file => 'stats2.csv')
                ->in_map('ATIME', \&fromepoch)
                ->in_map('MTIME', \&fromepoch)->write();

NAME

Text::AutoCSV - helper module to automate the use of Text::CSV

METHODS

new(\%attr)

(Class method) Returns a new instance of Text::AutoCSV. The object attributes are described by the (optional) hash ref \%attr. Currently the following attributes are available:

Preliminary note about fields_hr, fields_ar and fields_column_names attributes

By default, Text::AutoCSV assumes the input has a header and will use the field values of this first line (the header) to work out the column internal names. These internal names are used everywhere in Text::AutoCSV to designate columns.

The values are transformed as follows:

Any non-alphanumeric character is removed (except underscore) and all letters are switched to upper case. The regex to do this is

        s/[^a-z0-9_]//gi; s/^.*$/\U$&/;

Thus a header line of

        'Office Number 1,Office_2,Personal Number'

will produce the internal column names

        'OFFICENUMBER1' (first column)

        'OFFICE_2' (second column)

        'PERSONALNUMBER' (third column).

The attribute "fields_hr", "fields_ar" or "fields_column_names" (only one of the three is useful at a time) allows to change this behavior.

in_file

The name of the file to read CSV data from.

If not specified or empty, read standard input.

Example:

        my $csv = Text::AutoCSV->new(in_file => 'in.csv');
inh

File handle to read CSV data from. Normally you don't want to specify this attribute.

inh is useful if you don't like the way Text::AutoCSV opens the input file for you.

Example:

        open my $inh, "producecsv.sh|";
        my $csv = Text::AutoCSV->new(inh => $inh);
encoding

Comma-separated list of encodings to try to read input.

Note that finding the correct encoding of any given input is overkill. This script just tries encodings one after the other, and selects the first one that does not trigger a warning during reading of input. If all produce warnings, select the first one.

The encoding chosen is used in output, unless attribute "out_encoding" is specified.

Value by default: 'UTF-8,latin1'

IMPORTANT

If one tries something like encoding => 'latin1,UTF-8', it'll never detect UTF-8 as latin1 never triggers warnings during reading. It tends to be also true with encodings like UTF-16 that can remain happy with various inputs (and resulting in Western languages turned into Chinese text).

Ultimately this attribute should be used with a unique value. The result when using more than one value can produce weird results and should be considered experimental.

Example:

        my $csv = Text::AutoCSV->new(in_file => 'w.csv', encoding => 'UTF-16');
via

Normally, Text::AutoCSV checks whether encoding is UTF-8 and if so, it'll append :via(File::BOM) in the binmode. You can alter this behavior and enforce the value of via.

The value should start with a ':' character (Text::AutoCSV won't add one for you).

Value by default:

        ':via(File::BOM)' if encoding is UTF-8
        '' if encoding is not UTF-8

Example:

        my $csv = Text::AutoCSV->new(in_file => 'in.csv', via => ':raw:perlio:UTF-32:crlf');
dont_mess_with_encoding

If true, just ignore completely encoding and don't try to alter I/O operations with encoding considerations (using binmode instruction). Note that if inh attribute is specified, then Text::AutoCSV will consider the caller manages encoding for himself and dont_mess_with_encoding will be automatically set, too.

IMPORTANT

This attribute does not mean perl will totally ignore encoding and would consider character strings as bytes for example. The meaning of "dont_mess_with_encoding" is that Text::AutoCSV itself will totally ignore encoding matters, and leave it entirely to Perl' default.

Value by default:

        0 if inh attribute is not set
        1 if inh attribute is set

Example:

        my $csv = Text::AutoCSV->new(in_file => 'in.csv', dont_mess_with_encoding => 1);
sep_char

Specify the CSV separator character. Turns off separator auto-detection. This attribute is passed as is to Text::CSV->new().

Example:

        my $csv = Text::AutoCSV->new(in_file => 'in.csv', sep_char => ';');
quote_char

Specify the field quote character. This attribute is passed as is to Text::CSV->new().

Value by default: double quote ('"')

Example:

        my $csv = Text::AutoCSV->new(in_file => 'in.csv', quote_char => '\'');
escape_char

Specify the escape character. Turns off escape character auto-detection. This attribute is passed as is to Text::CSV->new().

Value by default: backslash ('\\')

Example:

        my $csv = Text::AutoCSV->new(in_file => 'in.csv', escape_char => '"');
in_csvobj

Text::CSV object to use. Normally you don't want to specify this attribute.

By default, Text::AutoCSV will manage creating such an object and will work hard to detect the parameters it requires.

Defining in_csvobj attribute turns off separator character and escape character auto-detection.

Using this attribute workarounds Text::AutoCSV philosophy a bit, but you may need it in case Text::AutoCSV behavior is not suitable for Text::CSV creation.

Example:

        my $tcsv = Text::CSV->new();
        my $acsv = Text::AutoCSV->new(in_file => 'in.csv', in_csvobj => $tcsv);
has_headers

If true, Text::AutoCSV assumes the input has a header line.

Value by default: 1

Example:

        my $csv = Text::AutoCSV->new(in_file => 'in.csv', has_headers => 0);
fields_hr

(Only if input has a header line) Hash ref that contains column internal names along with a regular expression to find it in the header line. For example if you have:

        my $csv = Text::AutoCSV->new(in_file => 'in.csv',
                fields_hr => {'PHONE OFFICE' => '^office phone nu',
                              'PHONE PERSONAL' => '^personal phone nu'});

And the header line is

        'Personal Phone Number,Office Phone Number'

the column name 'PHONE OFFICE' will designate the second column and the column name 'PHONE PERSONAL' will designate the first column.

You can choose column names like 'Phone Office' and 'Phone Personal' as well.

The regex search is case insensitive.

fields_ar

(Only if input has a header line) Array ref that contains column internal names. The array is used to create a hash ref of the same kind as "fields_hr", by wrapping the column name in a regex. The names are surrounded by a leading '^' and a trailing '$', meaning, the name must match the entire field name.

For example

        fields_ar => ['OFFICENUMBER', 'PERSONALNUMBER']

is strictly equivalent to

        fields_hr => {'OFFICENUMBER' => '^officenumber$', 'PERSONALNUMBER' = '^personalnumber$'}

The regex search is case insensitive.

fields_ar is useful if the internal names are identical to the file column names. It avoids repeating the names over and over as would happen if using L/<fields_hr> attribute.

NOTE

You might wonder why using fields_ar as opposed to Text::AutoCSV default' mechanism. There are two reasons for that:

1- Text::AutoCSV removes spaces from column names, and some people may want another behavior. A header name of 'Phone Number' will get an internal column name of 'PHONENUMBER' (default behavior, if none of fields_hr, fields_ar and fields_column_names attributes is specified), and one may prefer 'PHONE NUMBER' or 'phone number' or whatsoever.

2- By specifying a list of columns using either of fields_hr or fields_ar, you not only map column names as found in the header line to internal column names: you also request these columns to be available. If one of the requested columns cannot be found, Text::AutoCSV will croak (default) or print an error and return an undef object (if created with croak_if_error => 0).

fields_column_names

Array ref of column internal names, in the order of columns in file. This attribute works like the "column_names" attribute of Text::CSV. It'll just assign names to columns one by one, regardless of what the header line contains. It'll work also if the file has no header line.

Example:

        my $csv = Text::AutoCSV->new(in_file => 'in.csv',
                fields_column_names => ['My COL1', '', 'My COL3']);
out_file

Output file when executing the "write" method.

If not specified or empty, write to standard output.

Example:

        my $csv = Text::AutoCSV->new(in_file => 'in.csv', out_file => 'out.csv');
outh

File handle to write CSV data to when executing the "write" method. Normally you don't want to specify this attribute.

outh is useful if you don't like the way Text::AutoCSV opens the output file for you.

Example:

        my $outh = open "myin.csv', ">>";
        my $csv = Text::AutoCSV->new(in_file => 'in.csv', has_headers => 0, outh => $outh);
out_encoding

Enforce the encoding of output.

Value by default: input encoding

Example:

        my $csv = Text::AutoCSV->new(in_file => 'in.csv', out_file => 'out.csv',
                out_encoding => 'UTF-16');
out_utf8_bom

Enforce BOM (Byte-Order-Mark) on output, when it is UTF8. If output encoding is not UTF-8, this attribute is ignored.

NOTE

UTF-8 needs no BOM (there is no Byte-Order in UTF-8), and in practice, UTF8-encoded files rarely have a BOM.

Using this attribute is not recommended. It is provided for the sake of completeness, and also to produce Unicode files Microsoft EXCEL will be happy to read.

At first sight it would seem more logical to make EXCEL happy with something like this:

        out_encoding => 'UTF-16'

But... While EXCEL will identify UTF-16 and read it as such, it will not take into account the BOM found at the beginning. In the end the first cell will have 2 useless characters prepended. The only solution the author knows to workaround this issue if to use UTF-8 as output encoding, and enforce a BOM. That is, use:

        ..., out_encoding => 'UTF-8', out_utf8_bom => 1, ...
out_sep_char

Enforce the output CSV separator character.

Value by default: input separator

Example:

        my $csv = Text::AutoCSV->new(in_file => 'in.csv', out_file => 'out.csv', out_sep_char => ',');
out_quote_char

Enforce the output CSV quote character.

Value by default: input quote character

Example:

        my $csv = Text::AutoCSV->new(in_file => 'in.csv', out_file => 'out.csv', out_quote_char => '"');
out_escape_char

Enforce the output CSV escape character.

Value by default: input escape character

Example:

        my $csv = Text::AutoCSV->new(in_file => 'in.csv', out_file => 'out.csv',
                out_escape_char_char => '\\');
out_always_quote

If true, quote all fields of output (set always_quote of Text::CSV). If false, don't quote all fields of output (don't set always_quote of Text::CSV).

Value by default: same as what is found in input

While reading input, Text::AutoCSV works out whether or not all fields were quoted. If yes, then the output Text::CSV object has the always_quote attribute set, if no, then the output Text::CSV object does not have this attribute set.

Example:

        my $csv = Text::AutoCSV->new(in_file => 'in.csv', out_file => 'out.csv', out_always_quote => 1);
no_undef

If true, non-existent column values are set to an empty string instead of undef. It is also done on extra fields that happen to have an undef value (for example when the target of a linked field is not found).

Note this attribute does not work on callback functions output set with "in_map()": for example empty DateTime values (on fields identified as containing a date/time, see dates_* attributes below) are set to undef, even while no_undef is set. Indeed setting it to an empty string while non-empty values would contain a Datetime object would not be clean. An empty value in a placeholder containing an object must be undef.

In the end this attribute does not produce a consistent result, and it is not recommended to use it.

Value by default: 0

Example:

        my $csv = Text::AutoCSV->new(in_file => 'in.csv', no_undef => 1);
read_post_update_hr

To be set to a ref sub. Each time a record is read from input, call read_post_update_hr to update the hash ref of the record. The sub is called with 2 arguments: the hash ref to the record value and the hash ref to stats.

The stats allow to count events and are printed in the end of reading in case Text::AutoCSV is called in verbose mode (verbose => 1).

For example, the read_post_update_hr below will turn column 'CITY' values in upper case and count occurences of empty cities in stat display:

        Text::AutoCSV->new(in_file => 'addresses.csv', read_post_update_hr => \&updt, verbose => 1)
                ->write();
        sub updt {
                my ($hr, $stats) = @_;
                $hr->{'CITY'} =~ s/^.*$/\U$&/;
                $stats->{'empty city encountered'}++ if $hr->{'CITY'} eq '';
        }

IMPORTANT

You cannot create a field this way. To create a field, you have to use the member functions "field_add_link", "field_add_copy" or "field_add_computed".

NOTE

If you wish to manage some updates at field level, consider registering update functions with "in_map" and "out_map" member functions. These functions register callbacks that work at field level and with $_ variable (thus the callback function invoked is AutoCSV-agnostic).

"in_map" updates a field after read, "out_map" updates the field content before writing it.

walker_hr

To set to a sub ref that'll be executed each time a record is read from input. It is executed after "read_post_update_hr". The sub is called with 2 arguments: the hash ref to the record value and the hash ref to stats.

Note "read_post_update_hr" is meant for updating record fields just after reading, whereas "walker_hr" is read-only.

The stats allow to count events and are printed in the end of reading in case Text::AutoCSV is called in verbose mode (verbose => 1). If the "verbose" attribute is not set, the stats are lost.

The example below will count in the stats the number of records where the 'CITY' field is empty. Thanks to verbose => 1 attribute, at the end of reading the stats are displayed.

        my $csv = Text::AutoCSV->new(in_file => 'addresses.csv', walker_hr => \&walk1,
                verbose => 1)->read();
        sub walk1 {
                my ($hr, $stats) = @_;
                $stats->{'empty city'}++ if $hr->{'CITY'} eq '';
        }
walker_ar

To set to a sub ref that'll be executed each time a record is read from input. It is executed after "read_post_update_hr". The sub is called with 2 arguments: the array ref to the record value and the hash ref to stats.

Note "read_post_update_hr" is meant for updating record fields just after reading, whereas walker_hr is read-only.

The stats allow to count events and are printed in the end of reading in case Text::AutoCSV is called in verbose mode (verbose => 1). If the "verbose" attribute is not set, the stats are lost.

The array ref contains values in their natural order in the CSV. To be used with the column names, you have to use "get_fields_names" member function.

The example below will count in the stats the number of records where the 'CITY' field is empty. Thanks to verbose => 1 attribute, at the end of reading the stats are displayed. It produces the exact same result as the example in walker_hr attribute, but it uses walker_ar.

        use List::MoreUtils qw(first_index);
        my $csv = Text::AutoCSV->new(in_file => 'addresses.csv', walker_ar => \&walk2, verbose => 1);
        my @cols = $csv->get_fields_names();
        my $idxCITY = first_index { /^city$/i } @cols;
        die "No city field!??" if $idxCITY < 0;
        $csv->read();
        sub walk2 {
                my ($ar, $stats) = @_;
                $stats->{'empty city'}++ if $ar->[$idxCITY] eq '';
        }
write_filter_hr

To be set to a ref sub. Before writing a record to output, write_filter_hr is called and the record gets writen only if write_filter_hr return value is true. The sub is called with 1 argument: the hash ref to the record value.

For example, if you want to output only records where the 'CITY' column value is Grenoble:

        Text::AutoCSV->new(in_file => 'addresses.csv', out_file => 'grenoble.csv',
                write_filter_hr => \&filt) ->write();
        sub filt {
                my $hr = shift;
                return 1 if $hr->{'CITY'} =~ /^grenoble$/i;
                return 0;
        }
search_case

If true, searches are case sensitive by default. Searches are done by the member functions "search", "search_1hr", "vlookup", and linked fields ("field_add_link").

The search functions can also be called with the option "case", that takes precedence over the object-level search_case attribute value. See vlookup help.

Value by default: 0 (by default searches are case insensitive)

Example:

        my $csv = Text::AutoCSV->new(in_file => 'in.csv', search_case => 1);
search_trim

If true, searches ignore the presence of leading or trailing spaces in values.

The search functions can also be called with the option "trim", that takes precedence over the object-level search_trim attribute value. See vlookup help.

Value by default: 1 (by default searches ignore leading and trailing spaces)

Example:

        my $csv = Text::AutoCSV->new(in_file => 'in.csv', search_trim => 0);
search_ignore_empty

If true, empty fields are not included in the search indexes.

The search functions can also be called with the option "ignore_empty", that takes precedence over the object-level search_ignore_empty attribute value. See vlookup help.

Value by default: 1 (by default, search of the value '' will find nothing)

Example:

        my $csv = Text::AutoCSV->new(in_file => 'in.csv', search_ignore_empty => 0);
fields_dates

Array ref of field names that contain a date.

Once the formats of these fields is known (auto-detection by default), each of these fields will get a specific "in_map" sub that converts the text in a DateTime object and a "out_map" sub that converts back from DateTime to text.

NOTE

The "out_map" given to a datetime field is "defensive code": normally, "in_map" converts text into a DateTime object and "out_map" does the opposite, it takes a DateTime object and converts it to text. If ever "out_map" encounters a value that is not a DateTime object, it'll just stringify it (evaluation in a string context), without calling its DateTime formatter.

If the format cannot be detected for a given field, output an error message and as always when an error occurs, croak (unless "croak_if_error" got set to 0).

Value by default: none

Example:

        my $csv = Text::AutoCSV->new(in_file => 'logins.csv',
                fields_dates => ['LASTLOGIN', 'CREATIONDATE']);
fields_dates_auto

Boolean value. If set to 1, will detect dates formats on all fields. Fields in which a datetime format got detected are then managed as if they had been being listed in "fields_dates" attribute: they get an appropriate "in_map" sub and a "out_map" sub to convert to and from DateTime (see "fields_dates" attribute above).

fields_dates_auto looks for datetime on all fields, but it expects nothing: it won't raise an error if no field is found that contains datetime.

Value by default: 0

Example:

        my $csv = Text::AutoCSV->new(in_file => 'logins.csv', fields_dates_auto => 1);
dates_formats_to_try

Array ref of string formats.

Text::AutoCSV has a default built-in list of 20 date formats to try and 6 time formats (also it'll combine any date format with any time format).

dates_formats_to_try will replace Text::AutoCSV default format-list will the one you specify, in case the default would not produce the results you expect.

The formats are written in Strptime format.

Value by default (see below about the role of the pseudo-format ''):

        [ '',
        '%Y-%m-%d',
        '%Y.%m.%d',
        '%Y/%m/%d',
        '%m.%d.%y',
        '%m-%d-%Y',
        '%m.%d.%Y',
        '%m/%d/%Y',
        '%d-%m-%Y',
        '%d.%m.%Y',
        '%d/%m/%Y',
        '%m-%d-%y',
        '%m/%d/%y',
        '%d-%m-%y',
        '%d.%m.%y',
        '%d/%m/%y',
        '%Y%m%d%H%M%S',
        '%b %d, %Y',
        '%b %d %Y',
        '%b %d %T %Z %Y',
        '%d %b %Y',
        '%d %b, %Y' ]

IMPORTANT

The empty format (empty string) has a special meaning: when specified, Text::AutoCSV will be able to identify fields that contain only a time (not preceeded by a date).

Note

Format identification is over only when there is no more ambiguity. So the usual pitfall of US versus French dates (month-day versus day-month) gets resolved only when a date is encountered that disambiguates it (a date of 13th of the month or later).

Example with a weird format that uses underscores to separate elements, using either US (month, day, year), French (day, month, year), or international (year, month, day) order:

        my $csv = Text::AutoCSV->new(in_file => 'logins.csv',
                dates_formats_to_try => ['%d_%m_%Y', '%m_%d_%Y', '%Y_%m_%d']);
dates_formats_to_try_supp

Same as "dates_formats_to_try" but instead of replacing the default list of formats used during detection, it is added to this default list.

You want to use this attribute if you need a specific datetime format while continuing to benefit from the default list.

Example:

        my $csv = Text::AutoCSV->new(in_file => 'logins.csv',
                dates_formats_to_try_supp => ['%d_%m_%Y', '%m_%d_%Y', '%Y_%m_%d']);
dates_ignore_trailing_chars

If set to 1, datetime auto-detection will ignore trailing text that may follow detected datetime-like text.

Value by default: 1 (do ignore trailing chars)

        my $csv = Text::AutoCSV->new(in_file => 'logins.csv', dates_ignore_trailing_chars => 0);
dates_search_time

If set to 1, look for times when detecting datetime format. That is, whenever a date format candidate is found, a longer candidate that also contains a time (after the date) is tested.

Value by default: 1 (do look for times when auto-detecting datetime formats)

Example:

        my $csv = Text::AutoCSV->new(in_file => 'logins.csv', dates_search_time => 0);
dates_locales

Comma-separated string of locales to test when detecting datetime formats. Ultimately, Text::AutoCSV will try all combinations of date formats, times and locales.

Value by default: none (use perl default locale)

Example:

        my $csv = Text::AutoCSV->new(in_file => 'logins.csv', dates_locales => 'fr,de,en');
dates_zeros_ok

Boolean. If True, a date made only of 0s is regarded as being empty.

For example if dates_zeros_ok is False, then a date like 0000-00-00 will be always incorrect (as the day and month are out of bounds), therefore a format like '%Y-%m-%d' will never match for the field.

Conversely if dates_zeros_ok is True, then a date like 0000-00-00 will be processed as if being empty (''), thus the detection of format will work and when parsed, this "full of zeros" dates will be processed the same way as the empty string (= value will be undef).

IMPORTANT

"0s dates" are evaluated to undef when parsed, thus when converted back to text (out_map), they are set to an empty string, not to the original value.

NOTE

These "0s date" are seen when extracting Microsoft Active Directory (c) content using adfind. It is a hack provided by the author of this module, to avoid passing adfind outputs through special filters.

Value by default: 1

Example:

        my $csv = Text::AutoCSV->new(in_file => 'in.csv', dates_zeros_ok => 0);
out_dates_format

Enforce the format of dates in output, for all fields that contain a datetime value.

The format is written in Strptime format.

Value by default: none (by default, use format detected on input)

Example:

                # Detect any field containing a datetime value and convert it to yyyy-mm-dd whatever the
                # input format is.
        Text::AutoCSV->new(in_file => 'in.csv', out_file => 'out.csv', fields_dates_auto => 1,
                out_dates_format => '%F')->write();
out_dates_locale

Taken into account only if "out_dates_format" is used.

Sets the locale to apply on "out_dates_format".

Value by default: none (by default, use the locale detected on input)

Example:

                # Detect any field containing a datetime value and convert it to a US datetime whatever the
                # input format is.
        Text::AutoCSV->new(in_file => 'in.csv', out_file => 'out.csv', fields_dates_auto => 1,
                out_dates_format => '%b %d, %Y, %I:%M:%S %p', out_dates_locale => 'en')->write();
croak_if_error

If true, stops the program execution in case of error.

IMPORTANT

Value by default: 1

If set to zero (croak_if_error => 0), errors are displayed as warnings. This printing can then be affected by setting the "quiet" attribute.

verbose

If true, get Text::AutoCSV to be a bit talkative instead of speaking only when warnings and errors occur. Verbose output is printed to STDERR by default, this can be tuned with the "infoh" attribute.

Value by default: 0

Example:

        my $csv = Text::AutoCSV->new(in_file => 'in.csv', verbose => 1);
infoh

File handle to display program's verbose output. Has effect only with attribute verbose => 1.

Value by default: \*STDERR

Example:

        open my $infoh, ">", "log.txt";
        my $csv = Text::AutoCSV->new(in_file => 'in.csv', infoh => $infoh);
quiet

If true, don't display warnings and errors, unless croaking.

If "croak_if_error" attribute is set (as per default), still, Text::AutoCSV will produce output (on STDERR) when croaking miserably.

When using croak_if_error => 0, errors are processed as warnings and if "quiet" is set (in addition to "croak_if_error" being set to 0), there'll be no output. Note this way of working is not recommended, as things can go wrong without any notice to the caller.

Example:

        my $csv = Text::AutoCSV->new(in_file => 'in.csv', quiet => 1);
one_pass

If true, Text::AutoCSV will perform one reading of the input. If other readings are triggered, it'll raise an error and no reading will be done. By default, Text::AutoCSV will croak as is always the case if an error occurs.

Normally Text::AutoCSV will do multiple reads of input to work out certain characteristics of the CSV: guess of encoding and guess of escape character.

Also if member functions like "field_add_link", "field_add_copy", "field_add_computed", "read" or "write" are called after input has already been read, it'll trigger further reads as needed.

If one wishes a unique read of the input to occur, one_pass attribute is to be set.

When true, encoding will be assumed to be the first one in the provided list ("encoding" attribute), if no encoding attribute is provided, it'll be the first one in the default list, to date, it is UTF-8.

When true, and if attribute "escape_char" is not set, escape_char will be assumed to be '\\' (backslash).

By default, one_pass is set if inh attribute is set (caller provides the input file handle of input) or if input file is stdin (in_file attribute not set or set to an empty string).

Value by default:

        0 if inh attribute is not set and in_file attribute is set to a non empty string
        1 if inh attribute is set or in_file is not set or set to an empty string

Example:

        my $csv = Text::AutoCSV->new(in_file => 'in.csv', one_pass => 1);

read()

Read input entirely.

Return value

Returns the object itself in case of success. Returns undef if error.

Callback functions (when defined) are invoked, in the following order:

"read_post_update_hr", intended to do updates on fields values after each record read

"walker_ar", called after each record read, with an array ref of fields values

"walker_hr", called after each record read, with a hash ref of fields values

Example:

                # Do nothing - just check CSV can be read successfully
        Text::AutoCSV->new(in_file => 'in.csv')->read();

write()

Write input into output.

Return value

Returns the object itself in case of success. Returns undef if error.

- If the content is not in-memory at the time write() is called:

Each record is read (with call of "read_post_update_hr", "walker_ar" and "walker_hr") and then written. The read-and-write is done in sequence, each record is written to output before the next record is read from input.

- If the content is in-memory at the time write() is called:

No "read" operation is performed, instead, records are directly written to output.

If defined, "write_filter_hr" is called for each record. If the return value of "write_filter_hr" is false, the record is not written.

Example:

                # Copy input to output.
                # As CSV is parsed in-between, this copy also checks a number of characteristics about the
                # input, as opposed to a plain file copy operation.
        Text::AutoCSV->new(in_file => 'in.csv', out_file => 'out.csv')->write();

field_add_computed($new_field, $subref)

$new_field is the name of the created field.

$subref is a reference to a sub that'll calculate the new field value.

Return value

Returns the object itself in case of success. Returns undef if error.

Add a field calculated from other fields values. The subref runs like this:

        sub func {
                        # $new_field is the name of the field (allows to use one subref for more than one field
                        # calculation).
                        # $hr is a hash ref of fields values.
                        # $stats is a hash ref that gets printed (if Text::AutoCSV is created with verbose => 1)
                        # in the end.
                my ($new_field, $hr, $stats) = @_;

                my $field_value;
                # ... compute $field_value

                return $field_value;
        }

Example:

                # Add a field for the full name, made of the concatenation of the
                # first name and the last name.
        Text::AutoCSV->new(in_file => 'dirpeople.csv', out_file => 'dirwithfn.csv', verbose => 1)
                ->field_add_computed('FULLNAME', \&calc_fn)->write();
        sub calc_fn {
                my ($new_field, $hr, $stats) = @_;
                die "Man, you are in serious trouble!" unless $new_field eq 'FULLNAME';
                my $fn = $hr->{'FIRSTNAME'} . ' ' . uc($hr->{'LASTNAME'});
                $stats->{'empty full name'}++ if $fn eq ' ';
                return $fn;
        }

field_add_copy($new_field, $src_field, $opt_subref)

$new_field if the name of the new field.

$src_field is the name of the field being copied.

$opt_subref is optional. It is a reference to a sub that takes one string (the value of $src_field) and returns a string (the value assigned to $new_field).

Return value

Returns the object itself in case of success. Returns undef if error.

field_add_copy is a special case of "field_add_computed". The advantage of field_add_copy is that it relies on a sub that is Text::AutoCSV "unaware", just taking one string as input and returning another string as output.

IMPORTANT

The current field value is passed to field_add_copy in $_.

A call to

        $csv->field_add_copy($new_field, $src_field, $subref);

is equivalent to

        $csv->field_add_computed($new_field, \&subref2);
        sub subref2 {
                my (undef, $hr) = @_;
                local $_ = $hr->{$src_field};
                return $subref->();
        }

Example of a field copy + pass copied field in upper case and surround content with <<>>:

        my $csv = Text::AutoCSV->new(in_file => 'dirpeople.csv', out_file => 'd2.csv');
        $csv->field_add_copy('UCLAST', 'LASTNAME', \&myfunc);
        $csv->write();
        sub myfunc { s/^.*$/<<\U$&>>/; $_; }

Note that the calls can be chained as most member functions return the object itself upon success. The example above is equivalent to:

        Text::AutoCSV->new(in_file => 'dirpeople.csv', out_file => 'd2.csv')
                ->field_add_copy('UCLAST', 'LASTNAME', \&myfunc)
                ->write();
        sub myfunc { my $s = shift; $s =~ s/^.*$/<<\U$&>>/; return $s; }

field_add_link($new_field, $chain, $linked_file, \%opts)

$new_field is the name of the new field.

$chain is the CHAIN of the link, that is: 'LOCAL->REMOTE->PICK' where:

LOCAL is the field name to read the value from.

REMOTE is the linked field to find the value in. This field belongs to $linked_file.

PICK is the field from which to read the value of, in the record found by the search. This field belongs to $linked_file.

If $new_field is undef, the new field name is the name of the third field of $chain (PICK).

$linked_file is the name of the linked file, that gets read in a Text::AutoCSV object created on-the-fly to do the search on. $linked_file can also be a Text::AutoCSV object that you created yourself, allowing for more flexibility. Example:

        my $lcsv = Text::AutoCSV->new(in_file => 'logins.csv', case => 1);
        $csv->field_add_link('NAME', 'ID->LOGIN->DISPLAYNAME', $lcsv);

\%opts is a hash ref of optional attributes. The same values can be provided as with vlookup.

trim

If set to 1, searches will ignore leading and trailing spaces. That is, a LOCAL value of ' x ' will match with a REMOTE value of 'x'.

If option is not present, use "search_value_if_not_found" attribute of object (default value: 1).

Example:

        $csv->field_add_link('NAME', 'ID->LOGIN->DISPLAYNAME', 'logins.csv',
                { trim => 0 });
case

If set to 1, searches will take the case into account. That is, a LOCAL value of 'X' will match with a REMOTE value of 'x'.

If option is not present, use "search_case" attribute of object (default value: 0).

Example:

        $csv->field_add_link('NAME', 'ID->LOGIN->DISPLAYNAME', 'logins.csv',
                { case => 1 });
ignore_empty

If set to 1, empty values won't match. That is, a LOCAL value of '' will not match with a REMOTE value of ''.

If option is not present, use "search_ignore_empty" attribute of object (default value: 1).

Example:

        $csv->field_add_link('NAME', 'ID->LOGIN->DISPLAYNAME', 'logins.csv',
                { ignore_empty => 0 });
value_if_not_found

If the searched value is not found, the value of the field is undef, that produces an empty string at write time. Instead, you can specify the value.

If option is not present, use "search_value_if_not_found" attribute of object (default value: undef).

Example:

        $csv->field_add_link('NAME', 'ID->LOGIN->DISPLAYNAME', 'logins.csv',
                { value_if_not_found => '<not found!>' });
value_if_ambiguous

If the searched value is found in more than one record, the value of the field is undef, that produces an empty string at write time. Instead, you can specify the value.

If option is not present, use "search_value_if_ambiguous" attribute of object (default value: undef).

Example:

        $csv->field_add_link('NAME', 'ID->LOGIN->DISPLAYNAME', 'logins.csv',
                { value_if_ambiguous => '<ambiguous!>' });
ignore_ambiguous

Boolean value. If ignore_ambiguous is True and the searched value is found in more than one record, then, silently fall back on returning the value of the first record. Obviously if ignore_ambiguous is True, then the value of "value_if_ambiguous" is ignored.

If option is not present, use "search_ignore_ambiguous" attribute of object (default value: 1).

Example:

        $csv->field_add_link('NAME', 'ID->LOGIN->DISPLAYNAME', 'logins.csv',
                { ignore_ambiguous => 1 });

Example with multiple options:

        $csv->field_add_link('NAME', 'ID->LOGIN->DISPLAYNAME', 'logins.csv',
                { value_if_not_found => '?', ignore_ambiguous => 1 });

Return value

Returns the object itself in case of success. Returns undef if error.

Example of field_add_link usage:

        my $nom_compose = 0;
        my $zip_not_found = 0;
        Text::AutoCSV->new(in_file => 'pers.csv', walker_hr => \&walk)
                ->field_add_link('MYCITY', 'ZIP->ZIPCODE->CITY', 'zips.csv')->read();
        sub walk {
                my $hr = shift;
                $nom_compose++ if $hr->{'NAME'} =~ m/[- ]/;
                $zip_not_found++ unless defined($hr->{'MYCITY'});
        }
        print("Number of persons with a multi-part name: $nom_compose\n");
        print("Number of persons with unknown zipcode: $zip_not_found\n");

join($prefix, $chain, $linked_file, \%opts)

$prefix is the name to add to joined fields

$chain is the JOINCHAIN of the link, that is: 'LOCAL->REMOTE' where:

LOCAL is the field name to read the value from.

REMOTE is the linked field to find the value in. This field belongs to $linked_file.

As opposed to "field_add_link", there is no PICK part, as all fields of target are read.

As opposed to Text::AutoCSV habits of croaking whenever a field name is duplicate, here, the duplicates are resolved by appending _2 to the joined field name if it already exists. If _2 already exists, too, then _3 is appended instead, and so on, until a non-duplicate is found. This mechanism is executed given the difficulty to control all field names when joining CSVs.

$linked_file and \%opts work exactly the same way as for "field_add_link", see "field_add_link" for help.

Return value

Returns the object itself in case of success. Returns undef if error.

Example:

        Text::AutoCSV->new(in_file => 'pers.csv', out_file => 'pers_with_city.csv')
                ->join('Read from zips.csv:', 'ZIP->ZIPCODE', 'zips.csv')->write();

get_current_in_encoding()

Get the string of input encoding, for example 'latin2' or 'UTF-8', etc.

get_in_file_disp()

Get the printable name of in_file

get_sep_char()

Get the string of the input CSV separator character, for example ',' or ';'.

get_escape_char()

Get the string of the input escape character, for example '"' or '\\'.

get_is_always_quoted()

Get 1 if all fields of input are always quoted, 0 otherwise.

get_coldata()

Get an array that describes each column, from the first one (column 0) to the last.

Each element of the array is itself an array ref that contains 5 elements:

        0: Name of the field (as accessed in *_hr functions)
        1: Content of the field in the header line (if input has a header line)
        2: Column content type, shows some meta-data of fields created with field_add_* functions
        3: Datetime format detected, if ever, in the format Strptime
        4: Locale of datetime format detected, if ever

get_pass_count()

Get the number of input readings done. Useful only if you're interested in Text::AutoCSV internals.

get_in_mem_record_count()

Get the number of records currently stored in-memory. Useful only if you're interested in Text::AutoCSV internals.

get_max_in_mem_record_count()

Get the maximum number of records ever stored in-memory. Indeed this number can decrease: certain functions like field_add* member-functions discard in-memory content. Useful only if you're interested in Text::AutoCSV internals.

get_fields_names()

Get an array of the internal names of the columns.

Example:

        my @cols = $csv->get_fields_names();

get_field_name($n)

Get the Nth column name, the first column being number 0.

Example:

                # Get the field name of the third column
        my $col = $csv->get_field_name(2);

get_stats()

Certain callback functions provide a parameter to record event count: "field_add_computed", "read_post_update_hr", "walker_ar" and "walker_hr". By default, these stats are displayed if Text::AutoCSV got created with attribute verbose => 1. get_stats() returns the statistics hash of the object.

Example:

        my %stats = $csv->get_stats();

set_walker_ar($subref)

Normally one wants to define it at object creation time using "walker_ar" attribute. set_walker_ar allows to assign the attribute walker_ar after object creation.

See attribute "walker_ar" for help about the way $subref should work.

Return value

Returns the object itself in case of success. Returns undef if error.

Example:

                # Calculate the total of the two first columns, the first column being money in and the
                # second one being money out.
        my ($actif, $passif) = (0, 0);
        $csv->set_walker_ar(sub { my $ar = $_[0]; $actif += $ar->[0]; $passif += $ar->[1]; })->read();
        print("Actif =  $actif\n");
        print("Passif = $passif\n");

set_walker_hr($subref)

Normally one wants to define it at object creation time using "walker_hr" attribute. set_walker_hr allows to assign the attribute "walker_ar" after object creation.

See attribute "walker_hr" for help about the way $subref should work.

Return value

Returns the object itself in case of success. Returns undef if error.

Example:

        my $csv = Text::AutoCSV->new(in_file => 'directory.csv', verbose => 1);

        # ...

        $csv->set_walker_hr(
                sub {
                        my ($hr, $stat) = @_;
                        $stat{'not capital name'}++, return if $hr->{'NAME'} ne uc($hr->{'NAME'});
                        $stat{'name is capital letters'}++;
                }
        )->read();

set_out_file($out_file)

Normally one wants to define it at object creation time using "out_file" attribute. set_out_file allows to assign the attribute "out_file" after object creation.

Return value

Returns the object itself in case of success. Returns undef if error.

Example:

        $csv->set_out_file('mycopy.csv')->write();

get_keys()

Returns an array of all the record keys of input. A record key is a unique identifier that designates the record.

At the moment it is just an integer being the record number, the first one (that comes after the header line) being of number 0. For example if $csv input is made of one header line and 3 records (that is, a 4-line file typically, if no record contains a line break), $csv->get_keys() returns

        (0, 1, 2)

Example:

        my @allkeys = $csv->get_keys();

                # Allows to get the number of records (but to the expense of storing everything in-memory)
        my $nb_records = () = $csv->get_keys();

IMPORTANT

If not yet done, this function causes the input to be read entirely and stored in-memory.

get_hr_all()

Returns an array of all record contents of the input, each record being a hash ref.

Example:

        my @the_whole_story = $csv->get_hr_all();

IMPORTANT

If not yet done, this function causes the input to be read entirely and stored in-memory.

get_row_ar($record_key)

Returns an array ref of the record designated by $record_key.

Example:

                # Get content (as array ref) of last record
        my @allkeys = $csv->get_keys();
        my $lastk = $allkeys[-1];
        my $lastrec_ar = $csv->get_row_ar($lastk);

IMPORTANT

If not yet done, this function causes the input to be read entirely and stored in-memory.

get_row_hr($record_key)

Returns a hash ref of the record designated by $record_key.

Example:

                # Get content (as hash ref) of first record
        my @allkeys = $csv->get_keys();
        my $firstk = $allkeys[0];
        my $firstrec_hr = $csv->get_row_hr($firstk);

IMPORTANT

If not yet done, this function causes the input to be read entirely and stored in-memory.

get_cell($record_key, $field_name)

Get the value of the cell designated by its record key and the field name.

Example:

        my @allkeys = $csv->get_keys();
        my $midk = $allkeys[int($#allkeys / 2)];
        my $midname = $csv->get_cell($midk, 'NAME');

Note the above example is close to (but not equivalent, see below):

        my @allkeys = $csv->get_keys();
        my $midk = $allkeys[int($#allkeys / 2)];
        my $midrec_hr = $csv->get_row_hr($midk);
        my $midname = $midrec_hr->{'NAME'};

The difference between the two examples above is that if the field 'NAME' does not exist, the first example will cause a croak (or at least error print out, if "croak_if_error" attribute is set to 0 at Text::AutoCSV object creation time), while the second example will silently set $midname to undef.

IMPORTANT

If not yet done, this function causes the input to be read entirely and stored in-memory.

get_values($field_name)

Get an array made of the values of the given field name, for every records, in the order of the records.

Example:

        my @logins = $csv->get_values('LOGIN");

This is equivalent to:

        my @allkeys = $csv->get_keys();
        my @logins;
        push @logins, $csv->get_cell($_, 'LOGIN') for (@allkeys);

IMPORTANT

If not yet done, this function causes the input to be read entirely and stored in-memory.

get_recnum()

Returns the current record number, if a reading is in progress. If no read is in progress, return undef.

in_map($field, $subref)

read_update_after($field, $subref)

read_update_after is an alias of in_map.

After reading a record from input, update $field by calling $subref. The value is put in $_. Then the field value is set to the return value of $subref.

This feature is originally meant to manage datetime fields: the input and output CSVs carry text content, and in-between, the values dealt with are DateTime objects.

See "out_map" for an example.

out_map($field, $subref)

write_update_before($field, $subref)

write_update_before is an alias of out_map.

Before writing $field field content into the output file, pass it through out_map. The value is put in $_. Then the return value of $subref is written in the output.

Example:

Suppose you have a CSV file with the convention that a number surrounded by parenthesis is negative. You can register corresponding "in_map" and out_map functions. During the processing of data, the field content will be just a number (positive or negative), while in input and in output, it'll follow the "negative under parenthesis" convention.

In the below example, we rely on convention above and add a new field converted from the original one, that follows the same convention.

        sub in_updt {
                return 0 if !defined($_) or $_ eq '';
                return -$+{inner} if m/^\((?<inner>.*)\)$/;
                $_;
        }
        sub out_updt {
                return '' unless defined($_);
                return '(' . (-$_) . ')' if $_ < 0;
                $_;
        }
        sub convert {
                return ;
        }
        Text::AutoCSV->new(in_file => 'trans-euros.csv', out_file => 'trans-devises.csv')
                ->in_map('EUROS', \&in_updt)
                ->out_map('EUROS', \&out_updt)
                ->out_map('DEVISE', \&out_updt)
                ->field_add_copy('DEVISE', 'EUROS', sub { sprintf("%.2f", $_ * 1.141593); } )
                ->write();

search($field_name, $value, \%opts)

Returns an array ref of all records keys where the field $field_name has the value $value.

\%opts is an optional hash ref of options for the search. See help of "vlookup" about options.

IMPORTANT

An unsuccessful search returns an empty array ref, that is, [ ]. Thus you cannot check for definedness of search return value to know whether or not the search found something.

On the other hand, you can always examine the value search(...)->[0], as search is always an array ref. If the search found nothing, then, search(...)->[0] is not defined.

IMPORTANT bis

If not yet done, this function causes the input to be read entirely and stored in-memory.

Example:

        my $linux_os_keys_ar = $csv->search('OS', 'linux');

search_1hr($field_name, $value, \%opts)

Returns a hash ref of the first record where the field $field_name has the value $value.

\%opts is an optional hash ref of options for the search. See help of "vlookup" about options.

Note the options "value_if_not_found" and "value_if_ambiguous" are ignored. If not found, return undef. If the result is ambiguous (more than one record found) and ignore_ambiguous is set to a false value, return undef.

The other options are taken into account as for any search: "ignore_ambiguous", "trim", "case", "ignore_empty".

IMPORTANT

As opposed to "search", an unsuccessful search_1hr will return undef.

IMPORTANT bis

If not yet done, this function causes the input to be read entirely and stored in-memory.

Example:

        my $hr = $csv->search_1hr('LOGIN', $login);
        my $full_name = $hr->{'FIRSTNAME'} . ' ' . $hr->{'LASTNAME'};

vlookup($searched_field, $value, $target_field, \%opts)

Find the first record where $searched_field contains $value and out of this record, returns the value of $target_field. Returns $value_if_undef if the search is unsuccessful.

\%opts is optional. It is a hash of options for the vlookup:

trim

If set to 1, ignore spaces before and after the values to search.

If option is not present, use "search_trim" attribute of object (default value: 1).

case

If set to 1, do case sensitive searches.

If option is not present, use "search_case" attribute of object (default value: 0).

ignore_empty

If set to 1, ignore empty values in the search. The consequence is that you won't be able to find empty values by searching it.

If option is not present, use "search_ignore_empty" attribute of object (default value: 1).

value_if_not_found

Return value if vlookup finds nothing.

If option is not present, use "search_value_if_not_found" attribute of object (default value: undef).

value_if_ambiguous

Return value if vlookup find more than one result. Tune it only if ignore_ambiguous is unset.

If option is not present, use "search_value_if_ambiguous" attribute of object (default value: undef).

ignore_ambiguous

If set to 1, then if more than one result is found, silently return the first one.

If option is not present, use "search_ignore_ambiguous" attribute of object (default value: 1).

IMPORTANT

If not yet done, this function causes the input to be read entirely and stored in-memory.

Example:

        my $name = $csv->vlookup('LOGIN', $id, 'NAME', { value_if_not_found => '<login not found>' });

AUTHOR

Sébastien Millet <milletseb@laposte.net>

COPYRIGHT AND LICENSE

This software is copyright (c) 2016 by Sébastien Millet.

This is free software; you can redistribute it and/or modify it under the same terms as the Perl 5 programming language system itself.