parsepica - parse PICA+ data and print summary information
parsepica.pl [options] [files...]
    -help          brief help message
    -man           full documentation with examples
    -log FILE      print logging to a given file ('-': STDOUT, default)
    -input FILE    file with input files on each line ('-': STDIN)
    -output FILE   print all valid records to a given file ('-': STDOUT)
    -xml           output of records in XML
    -quiet         suppress logging
    -select        select a specific field (no XML output possible yet)
    -pselect       select (sub)fields and print values
Not fully implemented yet:
    -bad FILE      print invalid records to a given file ('-': STDOUT)
    -sru SRU       fetch records via SRU; command line arguments are CQL statements instead of files
This script demonstrates how to use the Perl PICA module. It can be used to check and count records. Input files can be given as command line arguments or listed, one per line, in an input file list (-input). Compressed files (.gz) can be read directly. If no input file is specified, input is read from STDIN.
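The input rules described above can be sketched in plain shell. This is only an illustration of the described behaviour, not the script's actual implementation: `read_input` is a hypothetical helper, and the sample field line assumes the PICA+ plain format with `$` as the subfield indicator.

```shell
# Hypothetical helper (not part of parsepica.pl): writes all input to STDOUT.
read_input() {
    if [ "$#" -eq 0 ]; then
        cat                          # no files given: pass STDIN through
        return
    fi
    for f in "$@"; do
        case "$f" in
            *.gz) gzip -dc "$f" ;;   # compressed file: decompress on the fly
            *)    cat "$f" ;;        # plain file: read as-is
        esac
    done
}

# Small demonstration with made-up sample data.
demo=$(mktemp -d)
printf '021A $aExample\n' > "$demo/plain"
gzip -c "$demo/plain" > "$demo/packed.gz"

from_file=$(read_input "$demo/plain")
from_gz=$(read_input "$demo/packed.gz")
from_stdin=$(read_input < "$demo/plain")
```

All three variables should end up holding the same field line, regardless of how the data was supplied.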
Logging information is printed to STDOUT (unless quiet mode is set) or to a specified logfile. Records that have been read can be written back to a given file or to STDOUT ('-'). Records that cannot be parsed produce error messages on STDERR.
Selecting fields with parsepica is around half as fast as using grep, but grep does not really parse and check for wellformedness.
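For comparison, a grep-based selection might look like the following minimal sketch. The sample data is made up and assumes the PICA+ plain format (one field per line, tag first, subfields introduced by `$`, a blank line between records). The match is purely textual, so broken records pass through unnoticed.

```shell
# Made-up sample data; the quoted heredoc keeps '$' literal.
workdir=$(mktemp -d)
cat > "$workdir/picadata" <<'EOF'
003@ $01234
021A $aFirst title

003@ $05678
021A $aSecond title
EOF

# Textual selection of all lines that start with the tag 021A.
selected=$(grep '^021A ' "$workdir/picadata")
count=$(grep -c '^021A ' "$workdir/picadata")
printf '%s\n' "$selected"
```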
Read records from 'picadata' and print parseable records to 'checkedrecords'.
Select all fields '021A' from 'picadata' and write to STDOUT.
Error handling needs to be implemented to collect broken records.
Examples to implement:
parsepica.pl -b errors picadata
Parse records in picadata and print records that are not wellformed to errors. The number of records will be reported.
parsepica.pl -out checked -bad errors -quiet picadata.gz
Parse records in picadata.gz. Print records that are wellformed to checked and the other records to errors. Suppress any messages.
Jakob Voss jakob.voss@gbv.de