Catmandu::Importer - Namespace for packages that can import
    # From the command line

    # JSON is an importer and YAML an exporter
    $ catmandu convert JSON to YAML < data.json

    # OAI is an importer and JSON an exporter
    $ catmandu convert OAI --url http://biblio.ugent.be/oai to JSON

    # Fetch remote content
    $ catmandu convert JSON --file http://example.com/data.json to YAML

    # From Perl

    use Catmandu;
    use Data::Dumper;

    my $importer = Catmandu->importer('JSON', file => 'data.json');

    $importer->each(sub {
        my $item = shift;
        print Dumper($item);
    });

    my $num = $importer->count;

    my $first_item = $importer->first;

    # Convert OAI to JSON in Perl
    my $importer = Catmandu->importer('OAI', url => 'http://biblio.ugent.be/oai');
    my $exporter = Catmandu->exporter('JSON');

    $exporter->add_many($importer);
A Catmandu::Importer is a Perl package that can generate structured data from sources such as JSON, YAML, XML, RDF or network protocols such as Atom, OAI-PMH, SRU and even DBI databases. Given a Catmandu::Importer, a programmer can read data from it using one of the many Catmandu::Iterable methods:
    $importer->to_array;
    $importer->count;
    $importer->each(\&callback);
    $importer->first;
    $importer->rest;
    ...etc...
Every Catmandu::Importer is also Catmandu::Fixable and thus inherits a 'fix' parameter that can be set in the constructor. When a 'fix' parameter is given, each item returned by the generator is automatically fixed using one or more Catmandu::Fix fixes. E.g.
    my $importer = Catmandu->importer('JSON', fix => ['upcase(title)']);
    $importer->each( sub {
        my $item = shift;
        # Every $item->{title} is now upcased...
    });

    # or via a Fix file
    my $importer = Catmandu->importer('JSON', fix => ['/my/fixes.txt']);
    $importer->each( sub {
        my $item = shift;
        # Every $item->{title} is now upcased...
    });
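The effect of the 'fix' parameter can be pictured as a wrapper around the generator. The following is a minimal sketch in plain Perl, not Catmandu's actual implementation: the helper with_fixes, the in-memory record list and the $upcase_title callback (a stand-in for the fix upcase(title)) are all hypothetical names introduced for illustration.

```perl
use strict;
use warnings;

# Hypothetical helper: wrap a generator so that every item passes through
# a list of fix callbacks before it is returned to the caller.
sub with_fixes {
    my ($generator, @fixes) = @_;
    return sub {
        my $item = $generator->();
        return undef unless defined $item;    # end of stream
        $item = $_->($item) for @fixes;
        return $item;
    };
}

# Stand-in generator over an in-memory list of records.
my @records = ({title => 'war and peace'}, {title => 'ulysses'});
my $source  = sub { return shift @records };

# Stand-in for the fix upcase(title).
my $upcase_title = sub {
    my ($item) = @_;
    $item->{title} = uc $item->{title};
    return $item;
};

my $fixed = with_fixes($source, $upcase_title);

my @titles;
while (defined(my $item = $fixed->())) {
    push @titles, $item->{title};
}
print join(', ', @titles), "\n";
```

The caller only ever sees fixed items, which is why code inside each() does not need to know whether a fix was configured.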
Read input from a local file given by its path. If the path looks like a URL, the content is fetched first and then passed to the importer. Alternatively, a scalar reference can be passed to read from a string.
Read input from an IO::Handle. If not specified, Catmandu::Util::io is used to create the input stream from the file argument or by using STDIN.
Binmode of the input stream fh. Set to :utf8 by default.
An ARRAY of one or more Fix-es or Fix scripts to be applied to imported items.
The data at data_path is imported instead of the original data.
    # given this imported item:
    {abc => [{a=>1},{b=>2},{c=>3}]}
    # with data_path 'abc', this item gets imported instead:
    [{a=>1},{b=>2},{c=>3}]
    # with data_path 'abc.*', 3 items get imported:
    {a=>1}
    {b=>2}
    {c=>3}
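The difference between 'abc' and 'abc.*' can be tried out in plain Perl. This is only a sketch under simplifying assumptions: select_by_path is a hypothetical helper that handles dotted hash keys and the '*' wildcard, and it is not how Catmandu itself resolves paths.

```perl
use strict;
use warnings;

# Hypothetical helper, not Catmandu's implementation: resolve a simple
# dotted path against an item; '*' fans out over the elements of an array.
sub select_by_path {
    my ($item, $path) = @_;
    my @current = ($item);
    for my $key (split /\./, $path) {
        my @next;
        for my $node (@current) {
            if ($key eq '*' && ref $node eq 'ARRAY') {
                push @next, @$node;
            }
            elsif (ref $node eq 'HASH' && exists $node->{$key}) {
                push @next, $node->{$key};
            }
        }
        @current = @next;
    }
    return @current;
}

my $item = {abc => [{a => 1}, {b => 2}, {c => 3}]};

my @whole = select_by_path($item, 'abc');      # one item: the array itself
my @parts = select_by_path($item, 'abc.*');    # three items: the hashes

printf "abc   -> %d item(s)\n", scalar @whole;
printf "abc.* -> %d item(s)\n", scalar @parts;
```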
Variables given here are interpolated into the file and http_body options. The syntax is the same as URI::Template.
    # named arguments
    my $importer = Catmandu->importer('JSON',
        file => 'http://{server}/{path}',
        variables => {server => 'biblio.ugent.be', path => 'file.json'},
    );

    # positional arguments
    my $importer = Catmandu->importer('JSON',
        file => 'http://{server}/{path}',
        variables => 'biblio.ugent.be,file.json',
    );

    # or
    my $importer = Catmandu->importer('JSON',
        url => 'http://{server}/{path}',
        variables => ['biblio.ugent.be','file.json'],
    );

    # or via the command line
    $ catmandu convert JSON --file 'http://{server}/{path}' --variables 'biblio.ugent.be,file.json'
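To make the named and positional forms concrete, here is a simplified sketch of the substitution in plain Perl. It assumes that positional values are matched to placeholders in order of appearance, and it only handles bare {name} placeholders; the real URI::Template syntax is richer (it also escapes values). The helper name interpolate is hypothetical.

```perl
use strict;
use warnings;

# Hypothetical helper: fill {name} placeholders in a template.
# $vars may be a hash ref (named), or an array ref or comma-separated
# string (positional, matched to placeholders in order of appearance).
sub interpolate {
    my ($template, $vars) = @_;
    $vars = [split /,/, $vars] unless ref $vars;
    if (ref $vars eq 'ARRAY') {
        my @names = $template =~ /\{(\w+)\}/g;
        my %named;
        @named{@names} = @$vars;
        $vars = \%named;
    }
    # Unknown placeholders are replaced by the empty string.
    $template =~ s{\{(\w+)\}}{ $vars->{$1} // '' }ge;
    return $template;
}

my $template = 'http://{server}/{path}';
my @urls = (
    interpolate($template, {server => 'biblio.ugent.be', path => 'file.json'}),
    interpolate($template, 'biblio.ugent.be,file.json'),
    interpolate($template, ['biblio.ugent.be', 'file.json']),
);
print "$_\n" for @urls;
```

All three calls produce the same URL, which mirrors the equivalence of the three constructor forms.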
These options are only relevant if file is a URL. See LWP::UserAgent for details about these options.
Set the GET/POST message body.
Set the type of HTTP request: 'GET', 'POST', ...
A reference to an HTTP::Headers object.
Supply your own HTTP client.
A string containing the name of the HTTP client.
Maximum number of HTTP redirects allowed.
Maximum execution time.
Verify the SSL certificate.
Maximum number of times to retry the HTTP request if it temporarily fails. Default is not to retry. See LWP::UserAgent::Determined for the HTTP status codes that initiate a retry.
Maximum number of retries, and the timeouts between them, for an HTTP request that temporarily fails. Default is not to retry. See LWP::UserAgent::Determined for the HTTP status codes that initiate a retry and the format of the timing value.
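The retry behaviour can be illustrated without any network access. The sketch below is not LWP::UserAgent::Determined itself: request_with_timing is a hypothetical helper, the timing value is assumed to be a comma-separated list of delays in seconds such as '1,3,15', and the set of retryable status codes is an assumption made for the example. Consult the LWP::UserAgent::Determined documentation for the authoritative values.

```perl
use strict;
use warnings;

# Status codes treated as temporary failures in this sketch (assumed).
my %retryable = map { $_ => 1 } (408, 500, 502, 503, 504);

# Hypothetical helper: call $request until it returns a non-retryable
# status or the delays run out, pausing between attempts.
sub request_with_timing {
    my ($request, $timing, $pause) = @_;
    $pause ||= sub { sleep $_[0] };
    my @delays   = split /,/, $timing;
    my $attempts = 0;
    while (1) {
        my $status = $request->();
        $attempts++;
        return ($status, $attempts) unless $retryable{$status} && @delays;
        $pause->(shift @delays);
    }
}

# Simulated endpoint: fails twice with 503, then answers 200.
my @responses = (503, 503, 200);
my @waited;
my ($status, $attempts) = request_with_timing(
    sub { shift @responses },
    '1,3,15',
    sub { push @waited, $_[0] },    # record the delays instead of sleeping
);
print "status $status after $attempts attempts, delays: @waited\n";
```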
See Catmandu::Iterable for all inherited methods.
Create your own importer by creating a Perl package in the Catmandu::Importer namespace that implements Catmandu::Importer. Basically, you need to create a method 'generator' which returns a callback that creates one Perl hash for each call:
    my $importer = Catmandu::Importer::Hello->new;
    my $next = $importer->generator;
    $next->();    # record
    $next->();    # next record
    $next->();    # undef = end of stream
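The same calling convention can be tried in plain Perl without Catmandu installed. In this sketch make_generator is a hypothetical stand-in for an importer's generator method, and an in-memory filehandle replaces the real input stream.

```perl
use strict;
use warnings;

# Stand-in for an importer's generator method: returns a closure that
# yields one hash ref per call and undef once the input is exhausted.
sub make_generator {
    my ($fh) = @_;
    return sub {
        my $line = <$fh>;
        return undef unless defined $line;    # end of stream
        chomp $line;
        return {hello => $line};
    };
}

# Read from an in-memory filehandle instead of a real file.
my $input = "Alice\nBob\n";
open my $fh, '<', \$input or die "cannot open in-memory handle: $!";

my $next = make_generator($fh);

my @seen;
while (defined(my $record = $next->())) {
    push @seen, $record->{hello};
}
print "read: @seen\n";
```

Because the closure keeps its own state, the caller needs nothing more than repeated calls and a check for undef.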
Here is an example of a simple Hello importer:
    package Catmandu::Importer::Hello;

    use Catmandu::Sane;
    use Moo;

    with 'Catmandu::Importer';

    sub generator {
        my ($self) = @_;
        my $fh = $self->fh;    # input stream, built from file or STDIN
        my $n  = 0;
        return sub {
            $self->log->debug("generating record " . ++$n);
            my $name = $fh->getline;
            return defined $name ? {"hello" => $name} : undef;
        };
    }

    1;
This importer can be called via the command line as:
    $ catmandu convert Hello to JSON < /tmp/names.txt
    $ catmandu convert Hello to YAML < /tmp/names.txt
    $ catmandu import Hello to MongoDB --database_name test < /tmp/names.txt
Or via Perl:
    use Catmandu;

    my $importer = Catmandu->importer('Hello', file => '/tmp/names.txt');
    $importer->each(sub {
        my $item = shift;
        # process $item here
    });
Catmandu::Iterable, Catmandu::Fix, Catmandu::Importer::CSV, Catmandu::Importer::JSON, Catmandu::Importer::YAML