21 Mar 2022 15:23:34 UTC
- Distribution: Catmandu
- Module version: 1.2019
- License: perl_5
- Perl: v5.14.0
Contributors (19):
- Nicolas Steenlant <nicolas.steenlant at ugent.be>
- Christian Pietsch
- Dave Sherohman
- Doug Bell
- EC2 Default User
- Jakob Voß
- Johann Rolschewski
- Magnus Enger
- Matthias Vandermaesen
- Mohammad S Anwar
- Nicolas Franck
- Patrick Hochstenbach
- Pieter De Praetere
- Snorri Briem
- Stefan Weil
- Tom Hukins
- Upasana Shukla
- Vitali Peil
- Zakariyya Mughal
- Dependencies
- Any::URI::Escape
- App::Cmd
- CGI::Expand
- Class::Method::Modifiers
- Clone
- Config::Onion
- Cpanel::JSON::XS
- Data::Compare
- Data::Util
- HTTP::Request
- Hash::Merge::Simple
- IO::Handle::Util
- LWP::UserAgent
- List::MoreUtils
- Log::Any
- Log::Any::Adapter
- MIME::Types
- Module::Build
- Module::Info
- Moo
- MooX::Aliases
- Package::Stash
- Parser::MGC
- Path::Iterator::Rule
- Path::Tiny
- Role::Tiny
- Role::Tiny::With
- String::CamelCase
- Sub::Exporter
- Sub::Quote
- Text::CSV
- Text::Hogan::Compiler
- Throwable
- Time::HiRes
- Try::Tiny::ByClass
- URI
- URI::Template
- UUID::Tiny
- Unicode::Normalize
- YAML::XS
- asa
- namespace::clean
NAME
Catmandu::Importer - Namespace for packages that can import
SYNOPSIS
    # From the command line

    # JSON is an importer and YAML an exporter
    $ catmandu convert JSON to YAML < data.json

    # OAI is an importer and JSON an exporter
    $ catmandu convert OAI --url http://biblio.ugent.be/oai to JSON

    # Fetch remote content
    $ catmandu convert JSON --file http://example.com/data.json to YAML

    # From Perl

    use Catmandu;
    use Data::Dumper;

    my $importer = Catmandu->importer('JSON', file => 'data.json');
    $importer->each(sub {
        my $item = shift;
        print Dumper($item);
    });

    my $num = $importer->count;
    my $first_item = $importer->first;

    # Convert OAI to JSON in Perl
    my $importer = Catmandu->importer('OAI', url => 'http://biblio.ugent.be/oai');
    my $exporter = Catmandu->exporter('JSON');

    $exporter->add_many($importer);
DESCRIPTION
A Catmandu::Importer is a Perl package that can generate structured data from sources such as JSON, YAML, XML, RDF or network protocols such as Atom, OAI-PMH, SRU and even DBI databases. Given a Catmandu::Importer, a programmer can read data from it using one of the many Catmandu::Iterable methods:
    $importer->to_array;
    $importer->count;
    $importer->each(\&callback);
    $importer->first;
    $importer->rest;
    ...etc...
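As a rough illustration of the iteration methods above, the following sketch reads three records from an in-memory JSON string (one object per line). It assumes Catmandu is installed. The sample data and the new_importer helper are made up for this example, and a fresh importer is created for each call so that no call depends on how much of the input an earlier one consumed.

```perl
use strict;
use warnings;
use Catmandu;

# Hypothetical sample data: one JSON object per line.
my $json = join "\n",
    '{"title":"first"}',
    '{"title":"second"}',
    '{"title":"third"}';

# Illustrative helper (not part of Catmandu): build a fresh importer that
# reads from the string, using the scalar reference form of the file option.
sub new_importer {
    return Catmandu->importer('JSON', file => \$json);
}

my $items = new_importer()->to_array;    # array reference with all records
my $count = new_importer()->count;       # number of records
my $first = new_importer()->first;       # first record only

# Collect every title with a callback.
my @titles;
new_importer()->each(sub {
    my $item = shift;
    push @titles, $item->{title};
});

print "count: $count, first: $first->{title}, titles: @titles\n";
```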
Every Catmandu::Importer is also Catmandu::Fixable and thus inherits a 'fix' parameter that can be set in the constructor. When a 'fix' parameter is given, each item returned by the generator is automatically fixed using one or more Catmandu::Fix-es. For example:
    my $importer = Catmandu->importer('JSON', fix => ['upcase(title)']);
    $importer->each(sub {
        my $item = shift;
        # Every $item->{title} is now upcased...
    });

    # or via a Fix file
    my $importer = Catmandu->importer('JSON', fix => ['/my/fixes.txt']);
    $importer->each(sub {
        my $item = shift;
        # Every $item->{title} is now upcased...
    });
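To make the effect of the 'fix' parameter concrete, here is a small sketch that applies the upcase(title) fix to two records read from a string. It assumes Catmandu is installed; the sample records and variable names are invented for illustration.

```perl
use strict;
use warnings;
use Catmandu;

# Hypothetical sample data: one JSON object per line.
my $json = join "\n",
    '{"title":"first"}',
    '{"title":"second"}';

# Every imported item passes through the fix before it reaches the caller.
my $importer = Catmandu->importer(
    'JSON',
    file => \$json,
    fix  => ['upcase(title)'],
);

my @titles;
$importer->each(sub {
    my $item = shift;
    push @titles, $item->{title};
});

print "$_\n" for @titles;
```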
CONFIGURATION
- file

  Read input from a local file given by its path. If the path looks like a url, the content will be fetched first and then passed to the importer. Alternatively a scalar reference can be passed to read from a string.

- fh

  Read input from an IO::Handle. If not specified, Catmandu::Util::io is used to create the input stream from the file argument or by using STDIN.

- encoding

  Binmode of the input stream fh. Set to :utf8 by default.

- fix

  An ARRAY of one or more Fix-es or Fix scripts to be applied to imported items.

- data_path

  The data at data_path is imported instead of the original data.

      # given this imported item:
      {abc => [{a=>1},{b=>2},{c=>3}]}
      # with data_path 'abc', this item gets imported instead:
      [{a=>1},{b=>2},{c=>3}]
      # with data_path 'abc.*', 3 items get imported:
      {a=>1}
      {b=>2}
      {c=>3}

- variables

  Variables given here will interpolate the file and http_body options. The syntax is the same as URI::Template.

      # named arguments
      my $importer = Catmandu->importer('JSON',
          file => 'http://{server}/{path}',
          variables => {server => 'biblio.ugent.be', path => 'file.json'},
      );

      # positional arguments
      my $importer = Catmandu->importer('JSON',
          file => 'http://{server}/{path}',
          variables => 'biblio.ugent.be,file.json',
      );

      # or
      my $importer = Catmandu->importer('JSON',
          file => 'http://{server}/{path}',
          variables => ['biblio.ugent.be','file.json'],
      );

      # or via the command line
      $ catmandu convert JSON --file 'http://{server}/{path}' --variables 'biblio.ugent.be,file.json'
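The data_path option described above can be tried without any files on disk. The sketch below assumes Catmandu is installed and feeds the same record used in the data_path description through the JSON importer as a string; the variable names are illustrative only.

```perl
use strict;
use warnings;
use Catmandu;

# The record from the data_path description, written as JSON.
my $json = '{"abc":[{"a":1},{"b":2},{"c":3}]}';

# With data_path 'abc.*' each element of the abc array becomes its own item.
my $items = Catmandu->importer(
    'JSON',
    file      => \$json,
    data_path => 'abc.*',
)->to_array;

printf "%d items imported\n", scalar @$items;
```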
HTTP CONFIGURATION
These options are only relevant if file is a url. See LWP::UserAgent for details about these options.

- http_body

  Set the GET/POST message body.

- http_method

  Set the type of HTTP request: 'GET', 'POST', ...

- http_headers

  A reference to an HTTP::Headers object.

Set your own HTTP client

Alternatively, set the parameters of the default client

- http_agent

  A string containing the name of the HTTP client.

- http_max_redirect

  Maximum number of HTTP redirects allowed.

- http_timeout

  Maximum execution time.

- http_verify_hostname

  Verify the SSL certificate.

- http_retry

  Maximum times to retry the HTTP request if it temporarily fails. Default is not to retry. See LWP::UserAgent::Determined for the HTTP status codes that initiate a retry.

- http_timing

  Maximum times and timeouts to retry the HTTP request if it temporarily fails. Default is not to retry. See LWP::UserAgent::Determined for the HTTP status codes that initiate a retry and the format of the timing value.

METHODS
first, each, rest, ...
See Catmandu::Iterable for all inherited methods.
CODING
Create your own importer by creating a Perl package in the Catmandu::Importer namespace that implements Catmandu::Importer. Basically, you need to create a method 'generator' which returns a callback that creates one Perl hash for each call:

    my $importer = Catmandu::Importer::Hello->new;
    my $next = $importer->generator;
    $next->(); # record
    $next->(); # next record
    $next->(); # undef = end of stream

Here is an example of a simple Hello importer:

    package Catmandu::Importer::Hello;

    use Catmandu::Sane;
    use Moo;

    with 'Catmandu::Importer';

    sub generator {
        my ($self) = @_;
        # Input stream built from the file option or STDIN
        my $fh = $self->fh;
        my $n  = 0;
        return sub {
            $self->log->debug("generating record " . ++$n);
            # Read one line per call; undef signals the end of the stream
            my $name = $fh->getline;
            return defined $name ? {"hello" => $name} : undef;
        };
    }

    1;

This importer can be called via the command line as:
    $ catmandu convert Hello to JSON < /tmp/names.txt
    $ catmandu convert Hello to YAML < /tmp/names.txt
    $ catmandu import Hello to MongoDB --database_name test < /tmp/names.txt

Or, via Perl:

    use Catmandu;

    my $importer = Catmandu->importer('Hello', file => '/tmp/names.txt');
    $importer->each(sub {
        my $item = shift;
    });

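For quick experiments, a custom importer can also be defined and exercised in a single script. The following is a minimal sketch under a few assumptions: Catmandu and Moo are installed, the package is declared inline (so it is instantiated with new rather than looked up by name through Catmandu->importer), and, unlike the Hello example above, the trailing newline is removed from each name. The sample names are invented.

```perl
use strict;
use warnings;

# Inline variant of the Hello importer, for illustration only.
package Catmandu::Importer::Hello {
    use Moo;
    with 'Catmandu::Importer';

    sub generator {
        my ($self) = @_;
        my $fh = $self->fh;
        return sub {
            my $name = $fh->getline;
            return unless defined $name;    # end of stream
            chomp $name;                    # assumption: drop the newline
            return {hello => $name};
        };
    }
}

# Read names from a string through the scalar reference form of 'file'.
my $names    = "Alice\nBob\n";
my $importer = Catmandu::Importer::Hello->new(file => \$names);
my $records  = $importer->to_array;

print "hello, $_->{hello}\n" for @$records;
```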
SEE ALSO
Catmandu::Iterable, Catmandu::Fix, Catmandu::Importer::CSV, Catmandu::Importer::JSON, Catmandu::Importer::YAML