App::CSVUtils - CLI utilities related to CSV
This document describes version 0.021 of App::CSVUtils (from Perl distribution App-CSVUtils), released on 2019-05-12.
This distribution contains the following CLI utilities:
csv-add-field
csv-avg
csv-concat
csv-convert-to-hash
csv-delete-field
csv-dump
csv-each-row
csv-grep
csv-list-field-names
csv-lookup-fields
csv-map
csv-munge-field
csv-replace-newline
csv-select-fields
csv-select-row
csv-setop
csv-sort
csv-sort-fields
csv-sort-rows
csv-sum
csv2tsv
dump-csv
tsv2csv
Usage:
csv_add_field(%args) -> [status, msg, payload, meta]
Add a field to CSV file.
Your Perl code (-e) will be called for each row (excluding the header row) and should return the value for the new field. $main::row is available and contains the current row. $main::rownum contains the row number (2 means the first data row). $csv is the Text::CSV_XS object. $main::field_idxs is also available for additional information.
By default, the new field is added as the last field, unless you specify one of --after (to put it after a certain field), --before (to put it before a certain field), or --at (to put it at a specific position, where 1 means as the first field).
This function is not exported.
Arguments ('*' denotes required arguments):
after => str
Put the new field after specified field.
at => int
Put the new field at specific position (1 means as first field).
before => str
Put the new field before specified field.
eval* => str|code
Perl code to do munging.
field* => str
Field name.
filename* => filename
Input CSV file.
header => bool (default: 1)
Whether CSV has a header row.
When you declare that CSV does not have header row (--no-header), the fields will be named field1, field2, and so on.
tsv => bool
Inform that input file is in TSV (tab-separated) format instead of CSV.
Returns an enveloped result (an array).
First element (status) is an integer containing HTTP status code (200 means OK, 4xx caller error, 5xx function error). Second element (msg) is a string containing error message, or 'OK' if status is 200. Third element (payload) is optional, the actual result. Fourth element (meta) is called result metadata and is optional, a hash that contains extra information.
Return value: (any)
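The enveloped result described above can be handled with a small helper. The following is a minimal sketch in core Perl, not part of App::CSVUtils: fake_csv_function is a hypothetical stand-in that only mimics the [status, msg, payload, meta] shape, and unwrap_result is a hypothetical helper that treats only status 200 as success.

```perl
use strict;
use warnings;

# Hypothetical stand-in for any csv_* function; it only mimics the
# [status, msg, payload, meta] envelope and does not read any file.
sub fake_csv_function {
    my %args = @_;
    return [400, "Please specify filename"] unless defined $args{filename};
    return [200, "OK", "a,b\n1,2\n", {}];
}

# Hypothetical helper: return the payload on status 200, die otherwise.
sub unwrap_result {
    my ($res) = @_;
    my ($status, $msg, $payload) = @$res;
    die "Failed with status $status: $msg\n" unless $status == 200;
    return $payload;
}

# Successful call: print the payload.
print unwrap_result(fake_csv_function(filename => "file.csv"));

# Failing call: the error message comes from the second element.
my $ok = eval { unwrap_result(fake_csv_function()); 1 };
print "Error: $@" unless $ok;
```

Since the functions are not exported, a real call would presumably use the fully qualified name, for example App::CSVUtils::csv_add_field(...).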
csv_avg(%args) -> [status, msg, payload, meta]
Output a summary row containing the arithmetic averages of the data rows.
with_data_rows => bool
Whether to also output data rows.
csv_concat(%args) -> [status, msg, payload, meta]
Concatenate several CSV files together, collecting all the fields.
For example, concatenating this CSV:

    col1,col2
    1,2
    3,4

and:

    col2,col4
    a,b
    c,d
    e,f

and:

    col3
    X
    Y

will result in:

    col1,col2,col4,col3
    1,2,
    3,4,
    ,a,b
    ,c,d
    ,e,f
    ,,,X
    ,,,Y
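The field-collecting behaviour in the example above can be sketched in core Perl. This is an illustration under stated assumptions, not the utility's actual code: concat_tables is a hypothetical helper, the tables are in-memory arrays rather than files, and every row is padded to the full field list, so trailing empty fields may differ from the output shown above.

```perl
use strict;
use warnings;

# Hypothetical helper: each table is an arrayref of rows, header row first.
sub concat_tables {
    my @tables = @_;
    my (@fields, %seen, @records);
    for my $table (@tables) {
        my ($header, @rows) = @$table;
        # Collect field names in order of first appearance.
        for my $name (@$header) {
            push @fields, $name unless $seen{$name}++;
        }
        # Remember each data row keyed by its own file's field names.
        for my $row (@rows) {
            my %record;
            @record{@$header} = @$row;
            push @records, \%record;
        }
    }
    # Lay every record out against the combined field list,
    # using an empty string where a file lacks the field.
    my @out = ([@fields]);
    for my $record (@records) {
        push @out, [ map { defined $record->{$_} ? $record->{$_} : '' } @fields ];
    }
    return \@out;
}

my $result = concat_tables(
    [ ['col1', 'col2'], [1, 2], [3, 4] ],
    [ ['col2', 'col4'], ['a', 'b'], ['c', 'd'], ['e', 'f'] ],
    [ ['col3'], ['X'], ['Y'] ],
);
print join(",", @$_), "\n" for @$result;
```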
filenames* => array[filename]
Input CSV files.
csv_convert_to_hash(%args) -> [status, msg, payload, meta]
Return a hash of field names as keys and first row as values.
row_number => int (default: 2)
Row number (e.g. 2 for first data row).
csv_delete_field(%args) -> [status, msg, payload, meta]
Delete one or more fields from CSV file.
fields* => array[str]
Field names.
csv_dump(%args) -> [status, msg, payload, meta]
Dump CSV as data structure (array of array/hash).
hash => bool
Provide row in $_ as hashref instead of arrayref.
csv_each_row(%args) -> [status, msg, payload, meta]
Run Perl code for every row.
Examples:
Delete user data:
csv_each_row( filename => "users.csv", eval => "unlink qq(/home/data/\$_->{username}.dat)", hash => 1 );
This is like csv_map, except result of code is not printed.
Perl code.
csv_grep(%args) -> [status, msg, payload, meta]
Only output row(s) where Perl expression returns true.
Only show rows where the amount field is divisible by 7:
csv_grep( filename => "file.csv", eval => "\$_->{amount} % 7 == 0", hash => 1);
Only show rows where date is a Wednesday:
csv_grep( filename => "file.csv", eval => "BEGIN { use DateTime::Format::Natural; \$parser = DateTime::Format::Natural->new } \$dt = \$parser->parse_datetime(\$_->{date}); \$dt->day_of_week == 3", hash => 1 );
This is like Perl's grep performed over rows of CSV. In $_, your Perl code will find the CSV row as an arrayref (or, if you specify -H, as a hashref). $main::row is also set to the row (always as arrayref). $main::rownum contains the row number (2 means the first data row). $main::csv is the Text::CSV_XS object. $main::field_idxs is also available for additional information.
Your code is then free to return true or false based on some criteria. Only rows where Perl expression returns true will be included in the result.
csv_list_field_names(%args) -> [status, msg, payload, meta]
List field names of CSV file.
csv_lookup_fields(%args) -> [status, msg, payload, meta]
Fill fields of a CSV file from another.
Example input:
    # report.csv
    client_id,followup_staff,followup_note,client_email,client_phone
    101,Jerry,not renewing,
    299,Jerry,still thinking over,
    734,Elaine,renewing,

    # clients.csv
    id,name,email,phone
    101,Andy,andy@example.com,555-2983
    102,Bob,bob@acme.example.com,555-2523
    299,Cindy,cindy@example.com,555-7892
    400,Derek,derek@example.com,555-9018
    701,Edward,edward@example.com,555-5833
    734,Felipe,felipe@example.com,555-9067
To fill up the client_email and client_phone fields of report.csv from clients.csv, we can use: --lookup-fields client_id:id --fill-fields client_email:email,client_phone:phone. The result will be:
    client_id,followup_staff,followup_note,client_email,client_phone
    101,Jerry,not renewing,andy@example.com,555-2983
    299,Jerry,still thinking over,cindy@example.com,555-7892
    734,Elaine,renewing,felipe@example.com,555-9067
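The lookup-and-fill step in this example can be sketched in core Perl as follows. This is only an illustration: fill_fields is a hypothetical helper, rows are in-memory hashrefs instead of CSV files, and the choice that the first matching source row wins is an assumption rather than documented behaviour.

```perl
use strict;
use warnings;

# Hypothetical helper. $lookup is [target_field, source_field];
# $fill maps target field names to source field names.
# Returns the number of target rows that were filled.
sub fill_fields {
    my ($target, $source, $lookup, $fill) = @_;
    my ($target_key, $source_key) = @$lookup;

    # Index source rows by the lookup field (assumption: first match wins).
    my %index;
    $index{ $_->{$source_key} } //= $_ for @$source;

    my $filled = 0;
    for my $row (@$target) {
        my $match = $index{ $row->{$target_key} } or next;
        $row->{$_} = $match->{ $fill->{$_} } for keys %$fill;
        $filled++;
    }
    return $filled;
}

my @report = (
    { client_id => 101, followup_staff => 'Jerry',  client_email => '', client_phone => '' },
    { client_id => 299, followup_staff => 'Jerry',  client_email => '', client_phone => '' },
    { client_id => 734, followup_staff => 'Elaine', client_email => '', client_phone => '' },
);
my @clients = (
    { id => 101, name => 'Andy',   email => 'andy@example.com',   phone => '555-2983' },
    { id => 299, name => 'Cindy',  email => 'cindy@example.com',  phone => '555-7892' },
    { id => 734, name => 'Felipe', email => 'felipe@example.com', phone => '555-9067' },
);

# Mirrors --lookup-fields client_id:id --fill-fields client_email:email,client_phone:phone
my $filled = fill_fields(
    \@report, \@clients,
    [ 'client_id', 'id' ],
    { client_email => 'email', client_phone => 'phone' },
);
print "$_->{client_id},$_->{client_email},$_->{client_phone}\n" for @report;
```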
count => bool
Do not output rows, just report the number of rows filled.
fill_fields* => str
ignore_case => bool
lookup_fields* => str
source* => filename
CSV file to lookup values from.
target* => filename
CSV file to fill fields of.
csv_map(%args) -> [status, msg, payload, meta]
Return result of Perl code for every row.
Create SQL insert statements (escaping is left as an exercise for users):
csv_map( filename => "file.csv", eval => q{"INSERT INTO mytable (id,amount) VALUES ($_->{id}, $_->{amount});"}, hash => 1 );
This is like Perl's map performed over rows of CSV. In $_, your Perl code will find the CSV row as an arrayref (or, if you specify -H, as a hashref). $main::row is also set to the row (always as arrayref). $main::rownum contains the row number (2 means the first data row). $main::csv is the Text::CSV_XS object. $main::field_idxs is also available for additional information.
Your code is then free to return a string based on some operation against these data. This utility will then print out the resulting string.
add_newline => bool (default: 1)
Whether to make sure each string ends with newline.
csv_munge_field(%args) -> [status, msg, payload, meta]
Munge a field in every row of CSV file.
Perl code (-e) will be called for each row (excluding the header row) and $_ will contain the value of the field, and the Perl code is expected to modify it. $main::row will contain the current row array. $main::rownum contains the row number (2 means the first data row). $main::csv is the Text::CSV_XS object. $main::field_idxs is also available for additional information.
csv_replace_newline(%args) -> [status, msg, payload, meta]
Replace newlines in CSV values.
Some CSV parsers or applications cannot handle multiline CSV values. This utility can be used to convert the newline to something else. There are a few choices: replace newline with space (--with-space, the default), remove newline (--with-nothing), replace with encoded representation (--with-backslash-n), or with characters of your choice (--with 'blah').
with => str (default: " ")
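As a rough illustration of the replacement described above, the sketch below rewrites newlines inside the values of one row. replace_newline is a hypothetical helper, and matching newlines with \R (which treats a CRLF pair as one newline) is an assumption about what counts as a newline, not the utility's documented behaviour.

```perl
use strict;
use warnings;

# Hypothetical helper: return a copy of the row with every newline in
# every value replaced by $with (a single space by default).
sub replace_newline {
    my ($row, $with) = @_;
    $with = " " unless defined $with;
    return [ map { (my $value = $_) =~ s/\R/$with/g; $value } @$row ];
}

my $row = [ "line one\nline two", "plain", "dos\r\nstyle" ];
print join("|", @{ replace_newline($row) }), "\n";          # like --with-space
print join("|", @{ replace_newline($row, "") }), "\n";      # like --with-nothing
print join("|", @{ replace_newline($row, '\n') }), "\n";    # like --with-backslash-n
```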
csv_select_fields(%args) -> [status, msg, payload, meta]
Only output selected field(s).
field_pat => re
Field regex pattern to select.
fields => array[str]
csv_select_row(%args) -> [status, msg, payload, meta]
Only output specified row(s).
row_spec* => str
Row number (e.g. 2 for first data row), range (2-7), or comma-separated list of such (2-7,10,20-23).
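The row specification syntax can be parsed along these lines. This is a minimal sketch in core Perl, not the utility's own parser: parse_row_spec is a hypothetical helper, and rejecting descending ranges is an assumption.

```perl
use strict;
use warnings;

# Hypothetical helper: expand a specification such as "2-7,10,20-23"
# into a sorted list of unique row numbers.
sub parse_row_spec {
    my ($spec) = @_;
    my %want;
    for my $part (split /,/, $spec) {
        if ($part =~ /\A(\d+)-(\d+)\z/) {
            my ($from, $to) = ($1, $2);
            die "Invalid range '$part'\n" if $from > $to;
            $want{$_} = 1 for $from .. $to;
        } elsif ($part =~ /\A\d+\z/) {
            $want{$part} = 1;
        } else {
            die "Invalid row specification '$part'\n";
        }
    }
    return sort { $a <=> $b } keys %want;
}

my @rows = parse_row_spec("2-7,10,20-23");
print join(",", @rows), "\n";
```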
csv_setop(%args) -> [status, msg, payload, meta]
Set operation against several CSV files.
    # file1.csv
    a,b,c
    1,2,3
    4,5,6
    7,8,9

    # file2.csv
    a,b,c
    1,2,3
    4,5,7
    7,8,9
Output of intersection (--intersect file1.csv file2.csv), which will return common rows between the two files:
    a,b,c
    1,2,3
    7,8,9
Output of union (--union file1.csv file2.csv), which will return all rows with duplicates removed:

    a,b,c
    1,2,3
    4,5,6
    4,5,7
    7,8,9
Output of difference (--diff file1.csv file2.csv), which will return all rows in the first file but not in the second:
    a,b,c
    4,5,6
Output of symmetric difference (--symdiff file1.csv file2.csv), which will return all rows in the first file not in the second, as well as rows in the second not in the first:
    a,b,c
    4,5,6
    4,5,7
You can specify --compare-fields to consider only some fields, for example --union --compare-fields a,b file1.csv file2.csv:

    a,b,c
    1,2,3
    4,5,6
    7,8,9
Each field specified in --compare-fields can be specified using F1:F2:... format to refer to different field names or indexes in each file, for example if file3.csv is:
    # file3.csv
    Ei,Si,Bi
    1,3,2
    4,7,5
    7,9,8
Then --union --compare-fields a:Ei,b:Bi file1.csv file3.csv will result in:
Finally you can print out certain fields using --result-fields.
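The four set operations described above can be sketched in core Perl on in-memory rows. This is an illustration only: setop_rows is a hypothetical helper, whole rows are compared (no --compare-fields or --result-fields handling), and the order of the output rows is not claimed to match the utility's.

```perl
use strict;
use warnings;

# Hypothetical helper: $op is one of intersect, union, diff, symdiff;
# $rows1 and $rows2 are arrayrefs of data rows (without the header).
sub setop_rows {
    my ($op, $rows1, $rows2) = @_;

    # Compare whole rows by joining their values with a separator
    # that is unlikely to appear in the data.
    my $key = sub { join "\x1f", @{ $_[0] } };
    my (%in1, %in2);
    $in1{ $key->($_) } = 1 for @$rows1;
    $in2{ $key->($_) } = 1 for @$rows2;

    # Collect output rows, skipping rows already emitted.
    my (@out, %seen);
    my $emit = sub { my $r = shift; push @out, $r unless $seen{ $key->($r) }++ };

    if ($op eq 'intersect') {
        for (@$rows1) { $emit->($_) if $in2{ $key->($_) } }
    } elsif ($op eq 'union') {
        $emit->($_) for @$rows1, @$rows2;
    } elsif ($op eq 'diff') {
        for (@$rows1) { $emit->($_) unless $in2{ $key->($_) } }
    } elsif ($op eq 'symdiff') {
        for (@$rows1) { $emit->($_) unless $in2{ $key->($_) } }
        for (@$rows2) { $emit->($_) unless $in1{ $key->($_) } }
    } else {
        die "Unknown operation '$op'\n";
    }
    return \@out;
}

my @file1 = ([1, 2, 3], [4, 5, 6], [7, 8, 9]);
my @file2 = ([1, 2, 3], [4, 5, 7], [7, 8, 9]);
for my $op (qw(intersect union diff symdiff)) {
    my $rows = setop_rows($op, \@file1, \@file2);
    print "$op: ", join(" ", map { join(",", @$_) } @$rows), "\n";
}
```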
compare_fields => str
op* => str
Set operation to perform.
result_fields => str
csv_sort_fields(%args) -> [status, msg, payload, meta]
Sort CSV fields.
This utility sorts the order of fields in the CSV. Example input CSV:
    b,c,a
    1,2,3
    4,5,6
Example output CSV:
    a,b,c
    3,1,2
    6,4,5
You can also reverse the sort order (-r), sort case-insensitively (-i), or provide the ordering explicitly, e.g. --example a,c,b.
ci => bool
example => str
A comma-separated list of field names.
reverse => bool
csv_sort_rows(%args) -> [status, msg, payload, meta]
Sort CSV rows.
This utility sorts the rows in the CSV. Example input CSV:
    name,age
    Andy,20
    Dennis,15
    Ben,30
    Jerry,30
Example output CSV (using --by-fields +age which means by age numerically and ascending):
    name,age
    Dennis,15
    Andy,20
    Ben,30
    Jerry,30
Example output CSV (using --by-fields -age, which means by age numerically and descending):
    name,age
    Ben,30
    Jerry,30
    Andy,20
    Dennis,15
Example output CSV (using --by-fields name, which means by name asciibetically and ascending):

    name,age
    Andy,20
    Ben,30
    Dennis,15
    Jerry,30
Example output CSV (using --by-fields ~name, which means by name asciibetically and descending):

    name,age
    Jerry,30
    Dennis,15
    Ben,30
    Andy,20
Example output CSV (using --by-fields +age,~name):
    name,age
    Dennis,15
    Andy,20
    Jerry,30
    Ben,30
You can also reverse the sort order (-r), sort case-insensitively (-i), or provide the comparison code yourself with --by-code, for example --by-code '$a->[1] <=> $b->[1] || $b->[0] cmp $a->[0]', which is equivalent to --by-fields +age,~name. If you use --hash, your code will receive the rows to be compared as hashrefs, e.g. --hash --by-code '$a->{age} <=> $b->{age} || $b->{name} cmp $a->{name}'.
by_code => str|code
Perl code to do sorting.
$a and $b (or the first and second argument) will contain the two rows to be compared.
by_fields => str
A comma-separated list of field sort specification.
+FIELD means sort numerically ascending, -FIELD numerically descending, FIELD asciibetically ascending, and ~FIELD asciibetically descending.
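The sort specification above can be turned into a comparison routine roughly as follows. This is a sketch in core Perl, not the utility's implementation: make_comparator is a hypothetical helper, rows are arrayrefs, and the field-name-to-index map is supplied by the caller.

```perl
use strict;
use warnings;

# Hypothetical helper: build a comparator from a specification such as
# "+age,~name". $field_idx maps field names to column indexes.
sub make_comparator {
    my ($spec, $field_idx) = @_;
    my @keys;
    for my $item (split /,/, $spec) {
        my ($prefix, $name) = $item =~ /\A([+\-~]?)(.+)\z/
            or die "Invalid sort specification '$item'\n";
        die "Unknown field '$name'\n" unless exists $field_idx->{$name};
        push @keys, {
            idx     => $field_idx->{$name},
            numeric => ($prefix eq '+' || $prefix eq '-') ? 1 : 0,
            desc    => ($prefix eq '-' || $prefix eq '~') ? 1 : 0,
        };
    }
    # Compare two rows key by key; the first non-zero result decides.
    return sub {
        my ($x, $y) = @_;
        for my $k (@keys) {
            my ($l, $r) = ($x->[ $k->{idx} ], $y->[ $k->{idx} ]);
            my $cmp = $k->{numeric} ? $l <=> $r : $l cmp $r;
            $cmp = -$cmp if $k->{desc};
            return $cmp if $cmp;
        }
        return 0;
    };
}

my @rows = (['Andy', 20], ['Dennis', 15], ['Ben', 30], ['Jerry', 30]);
my $cmp = make_comparator('+age,~name', { name => 0, age => 1 });
my @sorted = sort { $cmp->($a, $b) } @rows;
print join(",", @$_), "\n" for @sorted;
```

With the example rows from this section, the sorted names come out as Dennis, Andy, Jerry, Ben, matching the --by-fields +age,~name output shown earlier.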
csv_sum(%args) -> [status, msg, payload, meta]
Output a summary row containing the arithmetic sums of the data rows.
Please visit the project's homepage at https://metacpan.org/release/App-CSVUtils.
Source repository is at https://github.com/perlancar/perl-App-CSVUtils.
Please report any bugs or feature requests on the bugtracker website https://rt.cpan.org/Public/Dist/Display.html?Name=App-CSVUtils
When submitting a bug or request, please include a test-file or a patch to an existing test-file that illustrates the bug or desired feature.
csvgrep.
setop.
perlancar <perlancar@cpan.org>
This software is copyright (c) 2019, 2018, 2017, 2016 by perlancar@cpan.org.
This is free software; you can redistribute it and/or modify it under the same terms as the Perl 5 programming language system itself.