picadata - parse and validate PICA+ data
picadata [<command>] {path} {options} {files}
Convert, analyze and validate PICA+ data from the command line.
Convert between PICA+ serialization formats (the default command).
Print subfield values.
Split records into multiple records for each level. Implies -o.
Join multiple records into one and sort afterwards.
Count number of records, holdings, items, and fields.
Filter records that include any of some given (sub)fields.
List distinct fields or subfields in the data. Provide an Avram schema (-s/--schema) to include documentation.
Look up (sub)fields in an Avram schema given by option or read from stdin. Each (sub)field is marked with a status flag: optional (o or *), mandatory (. or +), repeatable (+ or *).
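The four status flags combine two independent properties, mandatory versus optional and repeatable versus non-repeatable. The following is a minimal sketch of that reading, derived only from the description above; the names `STATUS_FLAGS` and `describe_flag` are hypothetical and not part of picadata or PICA::Data.

```python
# Status flags as described for the schema lookup command: each flag encodes
# whether a (sub)field is mandatory and whether it is repeatable.
# The table and helper name are illustrative, not picadata's own code.
STATUS_FLAGS = {
    "o": {"required": False, "repeatable": False},
    "*": {"required": False, "repeatable": True},
    ".": {"required": True, "repeatable": False},
    "+": {"required": True, "repeatable": True},
}


def describe_flag(flag):
    """Return a short human-readable reading of one status flag."""
    status = STATUS_FLAGS[flag]
    need = "mandatory" if status["required"] else "optional"
    count = "repeatable" if status["repeatable"] else "non-repeatable"
    return f"{need}, {count}"


for flag in "o*.+":
    print(flag, describe_flag(flag))
```

Read this way, `*` and `+` differ only in whether the (sub)field must occur at least once.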
Validate data against an Avram schema (-s/--schema).
Compare PICA records from two inputs. Output is always annotated PICA Plain.
Apply modifications given in annotated PICA Plain.
Change subfield values and return either the modified result or, with option -a, a patch.
Build an Avram schema from input data, optionally based on an existing schema (-s/--schema). Add option -B/--abbrev to abbreviate the resulting schema.
PICA serialization type (plain, plus/normalized, binary, import, XML, ppxml, pixml, patch) with Plain as default. Guessed from first input filename unless specified. See format documentation at http://format.gbv.de/pica.
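The fallback order described above (explicit format, then first input filename, then Plain) can be sketched as follows. This is an illustration under stated assumptions, not picadata's actual logic: the extension table is hypothetical and covers only the example filenames used in this document (pica.dat, pica.plain, pica.xml), and `guess_format` is an invented helper name.

```python
from pathlib import Path

# Hypothetical extension table, taken only from the example filenames in this
# document; the rules picadata really applies may differ.
EXTENSION_TO_FORMAT = {"dat": "binary", "plain": "plain", "xml": "xml"}


def guess_format(files, explicit=None, default="plain"):
    """Pick an input format: explicit value, else first filename, else default."""
    if explicit:
        return explicit
    if files:
        extension = Path(files[0]).suffix.lstrip(".").lower()
        return EXTENSION_TO_FORMAT.get(extension, default)
    return default


print(guess_format(["pica.dat", "other.xml"]))
print(guess_format([]))
```

Only the first filename is consulted, so mixing formats in one call requires stating the format explicitly.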
PICA serialization type to enable writing parsed PICA data.
Stop parsing after n records. Can be abbreviated as -1, -2...
Sort record fields by field identifier and by occurrence at level 2.
Split record into selected level, includes higher level identifiers.
Enforce annotated PICA as output format or prevent with -A. Combined with --schema this will set annotations ! and ? to mark validation errors.
Select fields or subfield values specified by PICA Path expressions. Multiple expressions can be separated by | or by repeating the option. Positions such as /3-7 are read as occurrence ranges.
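Both ways of passing several PICA Path expressions, one value separated by | or the option repeated, lead to the same flat list. The sketch below shows that equivalence. `collect_paths` is a hypothetical helper, and trimming whitespace around each expression is an assumption rather than documented behavior.

```python
def collect_paths(option_values):
    """Flatten repeated option values and |-separated lists into one list."""
    paths = []
    for value in option_values:
        for expression in value.split("|"):
            expression = expression.strip()  # assumption: surrounding spaces are ignored
            if expression:
                paths.append(expression)
    return paths


# One value with a separator and two repeated values give the same result.
print(collect_paths(["003@|021A.a"]))
print(collect_paths(["003@", "021A.a"]))
```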
Avram Schema given by file or URL. Default set via environment variable PICA_SCHEMA.
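Because PICA_SCHEMA only supplies a default, a schema given on the command line would be expected to take precedence. A minimal sketch of that lookup order, assuming this precedence; `resolve_schema` is a hypothetical helper, not part of picadata:

```python
import os


def resolve_schema(option=None, environ=None):
    """Return the schema location: explicit option first, then PICA_SCHEMA."""
    environ = os.environ if environ is None else environ
    return option or environ.get("PICA_SCHEMA")


# The environment is passed in explicitly here so the example is reproducible.
print(resolve_schema("schema.json", {"PICA_SCHEMA": "default.json"}))
print(resolve_schema(None, {"PICA_SCHEMA": "default.json"}))
```

If neither source provides a schema the function returns `None`, which a caller would treat as "no schema configured".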
Report unknown fields and subfields on validation (disabled by default).
Abbreviate the Avram schema (with command <build>).
Colorize output. Only supported for PICA plain and PICA plus format.
Monochrome (don't colorize output).
Print version number and exit.
picadata pica.dat -t xml                    # convert binary to XML
picadata count -f plain < pica.plain        # parse and count records
picadata 003@ pica.xml                      # extract field 003@
picadata validate pica.xml -s schema.json   # validate against Avram schema
picadata modify 021A.a "New Title" pica.pp  # modify subfield value

# document fields used in a record
picadata fields pica.xml -s https://format.k10plus.de/avram.pl?profile=k10plus
See catmandu for a more elaborate command line tool for data processing (transformation, API access...), including support for PICA+ via Catmandu::PICA.