fsql - Perform SQL queries against files in CSV/TSV/LTSV/JSON/YAML formats
This document describes version 0.231 of fsql (from Perl distribution App-fsql), released on 2021-09-10.
fsql [OPTIONS] [ <QUERY> | --show-schema|-s ]
fsql lets you perform SQL queries against one or several "flat" files of various formats. Each file will be regarded as a SQL table. By using SQL queries, you can do various calculations or manipulations that are otherwise hard/cumbersome to do with traditional text-manipulating Unix commands like cut, sort, head, tail, uniq, and so on. Particularly: data grouping, joining, or filtering with SQL expressions and functions.
As a bonus, you can also modify data (currently CSV only) via SQL INSERT or DELETE commands.
The query result will then be printed out, in one of several available formats.
The magic of all this is performed by DBD::CSV and SQL::Statement.
To use fsql, you must at least specify one file/table (with --add or one of the --add-TYPE options). If none of those options are specified, a table is assumed in STDIN with name stdin and the format will be detected. You must also specify a SQL query.
fsql
--add
--add-TYPE
stdin
0 on success.
255 on I/O or SQL error.
99 on command-line options or input data error.
--add=FILENAME[:TABLENAME|OPT1=VAL1:...], -a
Add a table from a file. Type will be detected from filename extension (and some heuristics, if there is no file extension or extension is unrecognized). Die if type cannot be detected.
Sometimes the detection will miss. Alternatively, you can use one of the --add-TYPE options to add a specific table type.
If TABLENAME is not specified, it will be taken from FILENAME (e.g. with filename foo-bar.csv, table name will be foo_bar). Will croak if duplicate table name is detected. Table name must match regex /\A[A-Za-z_][A-Za-z_0-9]*\z/.
TABLENAME
FILENAME
foo-bar.csv
foo_bar
/\A[A-Za-z_][A-Za-z_0-9]*\z/
FILENAME can be - to mean the standard input (the default table name will be stdin).
-
Alternatively, you can specify table name as well as other option using the OPT1=VAL1:... syntax. Known options include:
Specify table name.
Default is 1. If set to 0, specifies that CSV/TSV file does not contain header row that contains column names. Column names will be col0, col1, col2, and so on. Has no effect on file types other than CSV/TSV.
col0
col1
col2
--add-csv=FILENAME[:TABLENAME|OPT1=VAL1:...]
Add a table from a CSV file.
--add-tsv=FILENAME[:TABLENAME|OPT1=VAL1:...]
Like --add-csv, but will load file as TSV (tab-separated value).
--add-csv
--add-ltsv=FILENAME[:TABLENAME|OPT1=VAL1:...]
Like --add-csv, but will load file as LTSV (labeled tab separated value, see Text::LTSV). Names of columns will be taken from the first row.
--add-json=FILENAME[:TABLENAME|OPT1=VAL1:...]
Like --add-csv, but will load file as JSON.
Data can be array, or array of arrays, or array of hashes, or an enveloped response (see Rinci::function), so it is suitable to accept piped output of Perinci::CmdLine-based programs.
--add-yaml=FILENAME[:TABLENAME|OPT1=VAL1:...]
Like --add-json, but will load file as YAML.
--add-json
--add-perl=FILENAME[:TABLENAME|OPT1=VAL1:...]
Like --add-json, but will load file as Perl.
--load-function=NAME, -F
Load a SQL function. This will load Perl module SQL::Statement::Function::ByName::NAME. See CPAN for list of available modules.
SQL::Statement::Function::ByName::NAME
If you use NAME:NEWNAME, you can load a SQL function as another name, for example:
NAME:NEWNAME
-F DAYOFYEAR:DOY
--define-function=NAME:CODE, -D
Define a SQL function. You need to specify name as well as perl code that implements it. Perl code will be wrapped in a subroutine, it should expect the function argument in $_[2] (for more details see SQL::Statement::Functions). Example:
$_[2]
--define-function 'MONTH:$_[2] =~ /^\d{4}-(\d{2})-/ or return undef; $1+0'
--aoa
Return array of array (the default). Only relevant to outputs like perl, json, yaml, text.
perl
json
yaml
text
--aoh
Return array of hashes instead of the default array of array, where each row is represented as a hash (dictionary/associated array) instead of an array. Only relevant to output formats like perl, json, yaml, text.
Returning a hash is convenient when you want column name information on each row, but you can't specify the same column twice and order of columns are not guaranteed.
--format=FORMAT (default: text), -f
Set output format.
The value csv or tsv or ltsv will cause query results to be output as a comma-separated or TAB-separated list or labeled-TAB separated list, respectively. As this isn't very useful for a schema listing, these values will be silently converted to text if --show-schema (-s) is also present.
csv
tsv
ltsv
--show-schema
-s
The other values perl, json, yaml, and text will be formatted using appropriate Data::Format::Pretty formatter.
The default value is the most used table format. So if your tables are mostly CSV, fsql will also output CSV by default.
--show-schema, -s
Instead of running a query, show schema instead. This is useful for debugging.
See SQL::Statement::Syntax for the range of supported SQL syntax. In short, you can do select with several kinds of joins, almost/around 100 builtin functions (with additional functions available from Perl modules, see next FAQ entry), etc.
Also, sometimes if there is SQL parsing error, the error message is not immediately obvious and can be a bit confusing. That is usually a parsing limitation/bug within SQL::Statement.
SQL::Statement allows loading Perl functions as SQL functions. There are several CPAN modules (see the SQL::Statement::Function::ByName:: namespace) which nicely package them so you can load them from fsql simply by using the -F option, e.g. to load the YEAR() function:
SQL::Statement::Function::ByName::
-F
% fsql -F YEAR --add-csv sometable.csv 'SELECT c1, YEAR(c2) FROM sometable ...'
If set to true, print debugging messages.
If set to true, will not cleanup tempdir.
Filter CSV (table from stdin is aptly named so):
% prog-that-produces-csv | fsql 'SELECT id,name FROM stdin WHERE id <= 1000' > final.csv
Pick output format, produce array of hashes instead of the default array of arrays:
% fsql -a ~/book.pl 'SELECT title,name FROM book WHERE year >= 2010' --aoh -f json
You can perform joins, of course:
% fsql -a t.json -a 2.csv:t2 'SELECT * FROM t1 LEFT JOIN t2 ON t1.uid=t2.id'
Show schema:
% fsql -a table1.json -a 2.csv:table2 -s
Insert row to CSV:
% fsql -a file.csv 'INSERT INTO file VALUES (1, 2, 3)'
Delete rows from CSV:
% fsql -a file.csv 'DELETE FROM file WHERE c1 < 10'
Input is CSV or TSV without header row (columns will be named 'col0', 'col1', and so on):
% fsql -a file.csv:header=0:table=t 'SELECT col1,col3 FROM t WHERE col0 <= 1000' > final.csv
Please visit the project's homepage at https://metacpan.org/release/App-fsql.
Source repository is at https://github.com/perlancar/perl-App-fsql.
perlancar <perlancar@cpan.org>
To contribute, you can send patches by email/via RT, or send pull requests on GitHub.
Most of the time, you don't need to build the distribution yourself. You can simply modify the code, then test via:
% prove -l
If you want to build the distribution (e.g. to try to install it locally on your system), you can install Dist::Zilla, Dist::Zilla::PluginBundle::Author::PERLANCAR, and sometimes one or two other Dist::Zilla plugin and/or Pod::Weaver::Plugin. Any additional steps required beyond that are considered a bug and can be reported to me.
This software is copyright (c) 2021, 2019, 2016, 2015, 2014 by perlancar <perlancar@cpan.org>.
This is free software; you can redistribute it and/or modify it under the same terms as the Perl 5 programming language system itself.
Please report any bugs or feature requests on the bugtracker website https://rt.cpan.org/Public/Dist/Display.html?Name=App-fsql
When submitting a bug or request, please include a test-file or a patch to an existing test-file that illustrates the bug or desired feature.
To install App::fsql, copy and paste the appropriate command in to your terminal.
cpanm
cpanm App::fsql
CPAN shell
perl -MCPAN -e shell install App::fsql
For more information on module installation, please visit the detailed CPAN module installation guide.