
NAME

csvgrep - search for patterns in a CSV and display results in a table

SYNOPSIS

 csvgrep <pattern> <file>
 csvgrep -d <directory> <pattern>

DESCRIPTION

csvgrep is a script that searches a CSV file for a pattern and displays the matching rows in a text table. The first line of the CSV is assumed to be a header row.

The simplest usage is to look for a word in a CSV:

 % csvgrep Murakami books.csv
 +-------------------+-----------------+-------+------+
 | Book              | Author          | Pages | Date |
 +-------------------+-----------------+-------+------+
 | Norwegian Wood    | Haruki Murakami | 400   | 1987 |
 | Men without Women | Haruki Murakami | 228   | 2017 |
 +-------------------+-----------------+-------+------+

As with regular grep, you can use the -i switch to make it case-insensitive:

 % csvgrep -i wood books.csv
 +-----------------------+-----------------+-------+------+
 | Book                  | Author          | Pages | Date |
 +-----------------------+-----------------+-------+------+
 | Norwegian Wood        | Haruki Murakami | 400   | 1987 |
 | A Walk in the Woods   | Bill Bryson     | 276   | 1997 |
 | Death Walks the Woods | Cyril Hare      | 222   | 1954 |
 +-----------------------+-----------------+-------+------+
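The behaviour shown so far can be sketched in a few lines. This is an illustrative Python sketch, not csvgrep's actual implementation (csvgrep is a Perl script). The function name grep_csv and the SAMPLE data are hypothetical, and the sketch assumes that "matching the whole line" can be approximated by re-joining the fields with commas:

```python
import csv
import io
import re

def grep_csv(text, pattern, ignore_case=False):
    """Return (header, matching_rows) for CSV text whose first line is a header."""
    regex = re.compile(pattern, re.IGNORECASE if ignore_case else 0)
    rows = list(csv.reader(io.StringIO(text)))
    if not rows:
        return [], []
    header, body = rows[0], rows[1:]
    # Assumption: the "whole line" is approximated by the fields joined with commas.
    matches = [row for row in body if regex.search(",".join(row))]
    return header, matches

# Sample rows taken from the examples in this document.
SAMPLE = """Book,Author,Pages,Date
Norwegian Wood,Haruki Murakami,400,1987
A Walk in the Woods,Bill Bryson,276,1997
Men without Women,Haruki Murakami,228,2017
"""

header, matches = grep_csv(SAMPLE, "wood", ignore_case=True)
print(header)
for row in matches:
    print(row)
```

The header row is never tested against the pattern; it is always kept so the table can be labelled.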

You can specify a subset of the columns to display with the -c option, which takes a comma-separated list of column numbers, with the first column being 0:

 % csvgrep -c 0,1,3 -i mary books.csv
 +--------------+--------------+------+
 | Book         | Author       | Date |
 +--------------+--------------+------+
 | Mary Poppins | PL Travers   | 1934 |
 | Frankenstein | Mary Shelley | 1818 |
 +--------------+--------------+------+

You can also refer to columns by their header names with the -c option:

 % csvgrep -c book,date -i mary books.csv
 +--------------+------+
 | Book         | Date |
 +--------------+------+
 | Mary Poppins | 1934 |
 | Frankenstein | 1818 |
 +--------------+------+
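One way to picture how a column spec could be resolved is the following Python sketch. It is not taken from csvgrep's source; resolve_columns and select are hypothetical names. It assumes that names are compared case-insensitively, which the example above suggests (book matches the Book column), and that an item made only of digits is treated as an index:

```python
def resolve_columns(spec, header):
    """Turn a spec such as "0,1,3" or "book,date" into a list of column indexes."""
    lowered = [name.lower() for name in header]
    indexes = []
    for item in spec.split(","):
        item = item.strip()
        if item.isdigit():
            # Numeric items are zero-based column indexes.
            index = int(item)
            if index >= len(header):
                raise ValueError(f"no such column: {item}")
        else:
            # Assumption: header names are matched case-insensitively.
            if item.lower() not in lowered:
                raise ValueError(f"unknown column: {item}")
            index = lowered.index(item.lower())
        indexes.append(index)
    return indexes

def select(row, indexes):
    """Keep only the requested fields of a row, in the requested order."""
    return [row[i] for i in indexes]

header = ["Book", "Author", "Pages", "Date"]
row = ["Frankenstein", "Mary Shelley", "280", "1818"]
print(select(header, resolve_columns("book,date", header)))
print(select(row, resolve_columns("0,1,3", header)))
```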

By default the pattern will be matched against the whole line, but you can use --match-column or -mc to specify that the pattern should only be matched against a specific column:

 % csvgrep -mc 0 -c 0,1,3 -i mary books.csv
 +--------------+--------------+------+
 | Book         | Author       | Date |
 +--------------+--------------+------+
 | Mary Poppins | PL Travers   | 1934 |
 +--------------+--------------+------+

The number of the match column refers to the numbering of the full set of columns, regardless of whether you've used the -c option. This means you can match against a column that you're not displaying.

You can also use the column header with the -mc option:

 % csvgrep -mc author -i mary books.csv
 +--------------+--------------+-------+------+
 | Book         | Author       | Pages | Date |
 +--------------+--------------+-------+------+
 | Frankenstein | Mary Shelley | 280   | 1818 |
 +--------------+--------------+-------+------+
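The separation between the match column and the displayed columns can be illustrated with a short Python sketch. It is a hypothetical model rather than csvgrep's code. The name grep_rows and its parameters are assumptions, and the sample data omits the Pages column for brevity:

```python
import re

def grep_rows(header, rows, pattern, match_column=None, display=None, ignore_case=False):
    """Filter rows on one column (or the whole line), then project the display columns."""
    regex = re.compile(pattern, re.IGNORECASE if ignore_case else 0)
    lowered = [name.lower() for name in header]

    def index_of(column):
        # A column may be given as a zero-based number or as a header name.
        column = str(column)
        return int(column) if column.isdigit() else lowered.index(column.lower())

    # The match index always refers to the full set of columns, not the displayed subset.
    match_index = None if match_column is None else index_of(match_column)
    if display is None:
        display_indexes = list(range(len(header)))
    else:
        display_indexes = [index_of(column) for column in display]

    result = []
    for row in rows:
        haystack = ",".join(row) if match_index is None else row[match_index]
        if regex.search(haystack):
            result.append([row[i] for i in display_indexes])
    return result

HEADER = ["Book", "Author", "Date"]
ROWS = [
    ["Mary Poppins", "PL Travers", "1934"],
    ["Frankenstein", "Mary Shelley", "1818"],
]

# Match on Author, but display only Book and Date.
print(grep_rows(HEADER, ROWS, "mary", match_column="author",
                display=["book", "date"], ignore_case=True))
```

Because filtering happens before projection, the Author column can decide which rows match even though it is not displayed.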

The pattern can be a Perl regexp, but you'll probably need to quote it from your shell:

 % csvgrep -i 'walk.*wood' books.csv
 +-----------------------+-------------+-------+------+
 | Book                  | Author      | Pages | Date |
 +-----------------------+-------------+-------+------+
 | A Walk in the Woods   | Bill Bryson | 276   | 1997 |
 | Death Walks the Woods | Cyril Hare  | 222   | 1954 |
 +-----------------------+-------------+-------+------+

At work we have a number of situations where we have a directory that contains multiple versions of a particular CSV file, for example with a feed from a customer. With the -d option, csvgrep will look at the most recent file in the specified directory, only considering files with a .csv or .tsv extension:

 % csvgrep -d /usr/local/feeds/users -i smith

If you want to look at the file 2 back from the most recent one, you can use the --back 2 option, or the shorthand version, -2:

 % csvgrep -d /usr/local/feeds/users -2 -i smith
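The file selection described above might look something like this Python sketch. It is an assumption-laden illustration, not csvgrep's implementation. pick_file is a hypothetical name, "most recent" is taken to mean the latest modification time, and back=N is assumed to mean the file N positions after the newest one in a newest-first list:

```python
import os

def pick_file(directory, back=0):
    """Return the path of the CSV/TSV file that is `back` steps older than the newest."""
    candidates = []
    for name in os.listdir(directory):
        path = os.path.join(directory, name)
        # Only regular files with a .csv or .tsv extension are considered.
        if os.path.isfile(path) and name.endswith((".csv", ".tsv")):
            candidates.append(path)
    # Newest first, by modification time (assumption about what "most recent" means).
    candidates.sort(key=os.path.getmtime, reverse=True)
    if back >= len(candidates):
        raise FileNotFoundError(f"no file {back} back in {directory}")
    return candidates[back]
```

With this model, pick_file(directory) corresponds to plain -d, and pick_file(directory, back=2) corresponds to -d with --back 2.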

I have various aliases defined, like this:

 alias tg="csvgrep -d .../file.csv -c 0,1,2 -i"

So then I can just run:

 tg smith

This is a script I've used internally, with features being added as I wanted them. Let me know if you've ideas for additional features, or send me a pull request.

Tab-Separated Values

TSV files are pretty common; they use a tab character instead of a comma. If the filename ends with .tsv rather than .csv, we'll set the field separator to be a tab character:

 % csvgrep -i norwegian ~/books.tsv
 +----------------+-----------------+-------+------+
 | Book           | Author          | Pages | Date |
 +----------------+-----------------+-------+------+
 | Norwegian Wood | Haruki Murakami | 400   | 1987 |
 +----------------+-----------------+-------+------+

This also applies to the -d option.
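As a rough illustration of choosing the separator from the filename, here is a minimal Python sketch. The names separator_for and read_rows are hypothetical, and the force_tab parameter stands in for the -t option described under OPTIONS. It does not reflect how csvgrep is actually written:

```python
import csv
import io
import os

def separator_for(filename, force_tab=False):
    """Pick the field separator: tab for .tsv files or when forced, comma otherwise."""
    if force_tab:
        return "\t"
    extension = os.path.splitext(filename)[1]
    return "\t" if extension == ".tsv" else ","

def read_rows(text, filename, force_tab=False):
    """Parse delimited text using the separator implied by the filename."""
    delimiter = separator_for(filename, force_tab)
    return list(csv.reader(io.StringIO(text), delimiter=delimiter))

tsv_text = "Book\tAuthor\nNorwegian Wood\tHaruki Murakami\n"
print(read_rows(tsv_text, "books.tsv"))
```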

OPTIONS

-c <column-spec>

A comma-separated list of the columns you want displayed, given as column numbers (the first column is 0) or column header names.

-d <directory-path>

Find the most recently modified .csv or .tsv file in the specified directory, and grep that.

--back <N> | -<N>

When using the -d option, go N back from the most recent file in the list of files.

-h

Display short help message.

-i

Case-insensitive grep.

--match-column <column> | -mc <column>

Match the pattern against the specified column only. The column can be given by name or by index (starting at 0).

-t

Use TAB as the field separator. This is selected automatically for files with a .tsv extension.

REPOSITORY

https://github.com/neilb/csvgrep

AUTHOR

Neil Bowers <neilb@cpan.org>

COPYRIGHT AND LICENSE

This software is copyright (c) 2017 by Neil Bowers <neilb@cpan.org>.

This is free software; you can redistribute it and/or modify it under the same terms as the Perl 5 programming language system itself.