csvgrep - search for patterns in a CSV and display results in a table
csvgrep <pattern> <file>
csvgrep -d <directory> <pattern>
csvgrep is a script that lets you look for a pattern in a CSV file, and then displays the results in a text table. We assume that the first line in the CSV is a header row.
The simplest usage is to look for a word in a CSV:
% csvgrep Murakami books.csv
+-------------------+-----------------+-------+------+
| Book              | Author          | Pages | Date |
+-------------------+-----------------+-------+------+
| Norwegian Wood    | Haruki Murakami | 400   | 1987 |
| Men without Women | Haruki Murakami | 228   | 2017 |
+-------------------+-----------------+-------+------+
As with regular grep, you can use the -i switch to make it case-insensitive:
% csvgrep -i wood books.csv
+-----------------------+-----------------+-------+------+
| Book                  | Author          | Pages | Date |
+-----------------------+-----------------+-------+------+
| Norwegian Wood        | Haruki Murakami | 400   | 1987 |
| A Walk in the Woods   | Bill Bryson     | 276   | 1997 |
| Death Walks the Woods | Cyril Hare      | 222   | 1954 |
+-----------------------+-----------------+-------+------+
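The behaviour shown so far (header row, whole-line matching, optional case-insensitivity, text-table output) can be sketched in a few lines of Python. This is only an illustration, not csvgrep's own Perl code: the function names `grep_rows` and `render_table` are hypothetical, and "whole line" matching is approximated by re-joining the parsed fields with commas.

```python
import csv
import io
import re

def grep_rows(text, pattern, ignore_case=False):
    """Return (header, matching_rows); the first line is treated as the header."""
    regex = re.compile(pattern, re.IGNORECASE if ignore_case else 0)
    rows = list(csv.reader(io.StringIO(text)))
    header, body = rows[0], rows[1:]
    # Assumption: the "whole line" is approximated by the fields joined with commas.
    return header, [row for row in body if regex.search(",".join(row))]

def render_table(header, rows):
    """Render a header and rows as a bordered text table."""
    # Each column is as wide as its longest cell, header included.
    widths = [max(len(cell) for cell in col) for col in zip(header, *rows)]
    border = "+" + "+".join("-" * (w + 2) for w in widths) + "+"

    def fmt(row):
        return "|" + "|".join(f" {cell:<{w}} " for cell, w in zip(row, widths)) + "|"

    return "\n".join([border, fmt(header), border, *map(fmt, rows), border])

SAMPLE = """Book,Author,Pages,Date
Norwegian Wood,Haruki Murakami,400,1987
A Walk in the Woods,Bill Bryson,276,1997
Frankenstein,Mary Shelley,280,1818
"""

header, rows = grep_rows(SAMPLE, "wood", ignore_case=True)
print(render_table(header, rows))
```

Without `ignore_case=True` the same pattern matches nothing in this sample, because every occurrence is capitalised.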
You can specify a subset of the columns to display with the -c option, which takes a comma-separated list of column numbers:
% csvgrep -c 0,1,3 -i mary books.csv
+--------------+--------------+------+
| Book         | Author       | Date |
+--------------+--------------+------+
| Mary Poppins | PL Travers   | 1934 |
| Frankenstein | Mary Shelley | 1818 |
+--------------+--------------+------+
You can also use the title of columns with the -c option:
% csvgrep -c book,date -i mary books.csv
+--------------+------+
| Book         | Date |
+--------------+------+
| Mary Poppins | 1934 |
| Frankenstein | 1818 |
+--------------+------+
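One way to picture how a -c value is interpreted is the Python sketch below. It is not taken from csvgrep itself: `resolve_columns` and `select` are hypothetical helpers, and comparing names case-insensitively is an assumption based on `book,date` selecting the `Book` and `Date` columns above.

```python
def resolve_columns(spec, header):
    """Turn a spec such as "0,1,3" or "book,date" into a list of column indexes."""
    lowered = [name.lower() for name in header]
    indexes = []
    for item in spec.split(","):
        item = item.strip()
        if item.isdigit():
            # Numeric entries are zero-based column numbers.
            indexes.append(int(item))
        else:
            # Assumption: header names are compared case-insensitively.
            indexes.append(lowered.index(item.lower()))
    return indexes

def select(row, indexes):
    """Keep only the requested columns, in the requested order."""
    return [row[i] for i in indexes]

header = ["Book", "Author", "Pages", "Date"]
row = ["Frankenstein", "Mary Shelley", "280", "1818"]

print(select(header, resolve_columns("book,date", header)))
print(select(row, resolve_columns("0,1,3", header)))
```

In this sketch an unknown column name raises `ValueError`; how csvgrep reports that case is not covered here.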
By default the pattern will be matched against the whole line, but you can use --match-column or -mc to specify that the pattern should only be matched against a specific column:
% csvgrep -mc 0 -c 0,1,3 -i mary books.csv
+--------------+--------------+------+
| Book         | Author       | Date |
+--------------+--------------+------+
| Mary Poppins | PL Travers   | 1934 |
+--------------+--------------+------+
The number of the match column refers to the numbering of the full set of columns, regardless of whether you've used the -c option. This means you can match against a column that you're not displaying.
You can also use the column header with the -mc option:
% csvgrep -mc author -i mary books.csv
+--------------+--------------+-------+------+
| Book         | Author       | Pages | Date |
+--------------+--------------+-------+------+
| Frankenstein | Mary Shelley | 280   | 1818 |
+--------------+--------------+-------+------+
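The interaction between the match column and the display columns can be illustrated with a short Python sketch: the filter runs against the full row, and only afterwards is the row cut down to the columns being displayed. The names `match_index` and `grep_column` are hypothetical, matching is always case-insensitive here for brevity, and this is not csvgrep's actual implementation.

```python
import re

def match_index(match_column, header):
    """Resolve a match column (number or header name) to an index into the full row."""
    if match_column.isdigit():
        return int(match_column)
    # Assumption: header names are compared case-insensitively.
    return [name.lower() for name in header].index(match_column.lower())

def grep_column(header, rows, pattern, match_column, display):
    """Filter rows on one column, then project them onto the display columns."""
    regex = re.compile(pattern, re.IGNORECASE)
    m = match_index(match_column, header)
    # The match index refers to the full row, so it may name a column that is not displayed.
    return [[row[i] for i in display] for row in rows if regex.search(row[m])]

header = ["Book", "Author", "Pages", "Date"]
rows = [
    ["Norwegian Wood", "Haruki Murakami", "400", "1987"],
    ["Men without Women", "Haruki Murakami", "228", "2017"],
    ["Frankenstein", "Mary Shelley", "280", "1818"],
]

# Match on Author, but display only Book and Date.
print(grep_column(header, rows, "murakami", "author", [0, 3]))
```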
The pattern can be a Perl regexp, but you'll probably need to quote it to protect it from your shell:
% csvgrep -i 'walk.*wood' books.csv
+-----------------------+-------------+-------+------+
| Book                  | Author      | Pages | Date |
+-----------------------+-------------+-------+------+
| A Walk in the Woods   | Bill Bryson | 276   | 1997 |
| Death Walks the Woods | Cyril Hare  | 222   | 1954 |
+-----------------------+-------------+-------+------+
At work we have a number of situations where we have a directory that contains multiple versions of a particular CSV file, for example with a feed from a customer. With the -d option, csvgrep will look at the most recent file in the specified directory, only considering files with a .csv or .tsv extension:
% csvgrep -d /usr/local/feeds/users -i smith
If you want to look at 2 files back, you can use the --back 2 option, or the shorthand version, -2:
% csvgrep -d /usr/local/feeds/users -2 -i smith
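The file selection done by -d and --back can be sketched roughly as follows. This is an illustration under stated assumptions, not csvgrep's own code: `pick_file` is a hypothetical name, "most recent" is taken to mean most recently modified, and `back=2` is assumed to mean the file two steps behind the newest one.

```python
import os

def pick_file(directory, back=0):
    """Return the path of the file `back` steps behind the newest .csv/.tsv file."""
    candidates = [
        os.path.join(directory, name)
        for name in os.listdir(directory)
        # Only files with a .csv or .tsv extension are considered.
        if name.endswith((".csv", ".tsv"))
    ]
    # Newest first, ordered by modification time (an assumption about "most recent").
    candidates.sort(key=os.path.getmtime, reverse=True)
    # Assumption: back=0 is the newest file, back=2 is two files older than that.
    return candidates[back]
```

With this sketch, `pick_file("/usr/local/feeds/users", back=2)` would play the role of the `-2` example above; it raises `IndexError` if the directory holds too few matching files.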
I have various aliases defined, like this:
alias tg="csvgrep -d .../file.csv -c 0,1,2 -i"
So then I can just run:
tg smith
This is a script I've used internally, with features being added as I wanted them. Let me know if you've ideas for additional features, or send me a pull request.
TSV files are pretty common; they use a tab character instead of a comma. If the filename ends with .tsv rather than .csv, we'll set the field separator to be a tab character:
% csvgrep -i norwegian ~/books.tsv
+----------------+-----------------+-------+------+
| Book           | Author          | Pages | Date |
+----------------+-----------------+-------+------+
| Norwegian Wood | Haruki Murakami | 400   | 1987 |
+----------------+-----------------+-------+------+
This also applies to the -d option.
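Choosing the field separator from the file extension amounts to something like the Python sketch below. The helpers `delimiter_for` and `read_rows` are hypothetical and only illustrate the rule described above; they are not part of csvgrep.

```python
import csv
import io

def delimiter_for(filename):
    """Tab for .tsv files, comma for everything else."""
    return "\t" if filename.endswith(".tsv") else ","

def read_rows(filename, text):
    """Parse text using the separator implied by the filename's extension."""
    return list(csv.reader(io.StringIO(text), delimiter=delimiter_for(filename)))

tsv_text = "Book\tAuthor\nNorwegian Wood\tHaruki Murakami\n"
print(read_rows("books.tsv", tsv_text))
```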
A comma-separated list of the columns you want displayed, given either as column numbers (the first column is 0) or as column titles.
Search the most recently modified .csv or .tsv file in the specified directory, and grep that.
Go N back in the list of files, when using the -d option.
Display short help message.
Case-insensitive grep.
Only search the specified column, which can be specified with the column's name or index (starting at 0).
Use TAB as the field separator. This will be picked automatically for files with a .tsv extension.
https://github.com/neilb/csvgrep
Neil Bowers <neilb@cpan.org>
This software is copyright (c) 2017 by Neil Bowers <neilb@cpan.org>.
This is free software; you can redistribute it and/or modify it under the same terms as the Perl 5 programming language system itself.