NAME

MARC::Moose::Lint::Checker - A class to 'lint' biblio records based on a rules file

VERSION

version 1.0.9

ATTRIBUTES

file

Name of the file containing the validation rules against which biblio records are checked.

METHODS

check( record )

This method checks a biblio record against the rules of the current 'lint' object. The biblio record is a MARC::Moose::Record object. An array of validation errors/warnings is returned. Each entry is a plain-text explanation of why the record doesn't comply with the validation rules.

SYNOPSIS

 use feature 'say';   # enables say(), used below
 use MARC::Moose::Record;
 use MARC::Moose::Reader::File::Iso2709;
 use MARC::Moose::Lint::Checker;

 # Read an ISO2709 file, and dump found errors
 my $reader = MARC::Moose::Reader::File::Iso2709->new(
     file => 'biblio.mrc' );
 my $lint = MARC::Moose::Lint::Checker->new(
     file => 'unimarc.rules' );
 while ( my $record = $reader->read() ) {
     if ( my @result = $lint->check($record) ) {
         say "Biblio record #", $record->field('001')->value;
         say join("\n", @result), "\n";
     }
 }

VALIDATION RULES

Validation rules are defined in a textual form. The file is composed of two parts: (1) field rules, (2) validation tables.

(1) Field rules define validation rules for each tag. A blank line separates tags. For example:

 102+
 #
 #
 a+CTRY
 b+
 c+
 2+

Line 1 contains the field tag. If a + is present, the field is repeatable. If a _ is present, the field is mandatory. Lines 2 and 3 each contain a regular expression against which indicators 1 and 2 are validated. # means a blank indicator. Line 4 and the following lines define rules for validating subfields. Each of these lines has two parts separated by a blank. The first part contains the subfield's letter, then + (repeatable) and/or _ (mandatory), followed by an optional validation table name. The second part contains a regular expression against which the subfield content is validated.
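To make the layout described above concrete, here is a minimal sketch that parses one field-rule block into a Perl data structure. It is not the parser used by MARC::Moose::Lint::Checker: the function name parse_field_rule, the hash keys, and the choice to turn # into a space inside indicator expressions are all assumptions made for illustration.

```perl
use strict;
use warnings;

# Hypothetical helper, not part of MARC::Moose: parse one field-rule block
# (the lines found between two blank lines) into a hash reference.
sub parse_field_rule {
    my ($block) = @_;

    # Drop empty lines, then trim surrounding whitespace on each line
    my @lines = grep { /\S/ } split /\n/, $block;
    s/^\s+|\s+$//g for @lines;

    # Line 1: tag, then optional '+' (repeatable) and/or '_' (mandatory)
    my ($tag, $flags) = $lines[0] =~ /^([0-9A-Za-z]{3})([+_]*)$/
        or die "Invalid tag line: $lines[0]\n";

    # Lines 2 and 3: indicator expressions. Assumption: '#' is turned
    # into a space so the expression can match a blank indicator.
    my @ind = map { my $re = $_; $re =~ s/#/ /g; $re } @lines[1, 2];

    # Line 4 onwards: letter, flags, optional table name, optional regexp
    my %subfields;
    for my $line (@lines[3 .. $#lines]) {
        my ($letter, $sflags, $table, $regexp) =
            $line =~ /^([a-z0-9])([+_]*)([A-Z]*)(?:\s+(.*))?$/
            or die "Invalid subfield line: $line\n";
        $subfields{$letter} = {
            repeatable => ($sflags =~ /\+/ ? 1 : 0),
            mandatory  => ($sflags =~ /_/  ? 1 : 0),
            table      => $table,            # '' when no table is named
            regexp     => $regexp // '',     # '' when no regexp is given
        };
    }

    return {
        tag        => $tag,
        repeatable => ($flags =~ /\+/ ? 1 : 0),
        mandatory  => ($flags =~ /_/  ? 1 : 0),
        ind1       => $ind[0],
        ind2       => $ind[1],
        subfields  => \%subfields,
    };
}

my $rule = parse_field_rule(<<'END');
102+
#
#
a+CTRY
b+
c+
2+
END

printf "%s repeatable=%d mandatory=%d table(a)=%s\n",
    @{$rule}{qw(tag repeatable mandatory)}, $rule->{subfields}{a}{table};
```

With the 102 block shown earlier, the resulting structure marks the field as repeatable, not mandatory, and ties subfield a to the CTRY table.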

(2) The validation tables part of the file defines one or more validation tables. Each table begins with a line made of ==== followed by the table name in uppercase. Each following line is a code belonging to that table.

For example, this is a simplified standard UNIMARC validation rules file:

 100_
 #
 #
 a ^[0-9]{8}[a-ku][0-9 ]{8}[abcdeklu ]{3}[a-huyz][01 ][a-z]{3}[a-cy][01|02|03|04|05|06|07|08|09|10|11|50]{2}
 
 101
 0,1,2
 #
 a+LANG ^[a-z]{3}$
 b+LANG ^[a-z]{3}$
 c+LANG ^[a-z]{3}$
 d+ :^[a-z]{3}$
 f+ ^[a-z]{3}$
 g+ ^[a-z]{3}$
 h+ ^[a-z]{3}$
 i+ ^[a-z]{3}$
 j+ ^[a-z]{3}$
 
 102
 #
 #
 a+CTRY
 b+
 c+
 2+
 
 105
 #
 #
 a ^.{13}$
 
 106
 #
 #
 a ^.{1}$
 
 200_
 0|1
 #|1
 a_+
 b+
 c+
 d+
 e+
 f+
 g+
 h+
 i+
 v
 z+
 5+
 
 205+
 #
 #      
 a 
 b+ 
 d+ 
 f+ 
 g+
 
 206+
 #|0     
 #      
 a 
 b+ 
 c 
 d 
 e 
 f
 
 207   
 #       
 0|1    
 a+
 z+ ^.{7}$
 
 ==== CTRY
 AF
 AL
 DZ
 GG
 GN
 GW
 GY
 HT
 HM
 VE
 VN
 VG
 VI
 ZM
 ZW
 
 ==== LANG
 aar
 afh
 afr
 afa
 ain
 aka
 akk
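The validation tables at the end of the sample file can be read with a few lines of Perl. The sketch below is an illustration only, not the code used by MARC::Moose::Lint::Checker: parse_tables is a hypothetical helper, and storing each table as a hash of codes is an assumption chosen to make lookups simple.

```perl
use strict;
use warnings;

# Hypothetical helper, not part of MARC::Moose: collect every
# '==== NAME' table into a hash of hashes, one inner hash per table.
sub parse_tables {
    my ($text) = @_;
    my (%tables, $current);
    for my $line (split /\n/, $text) {
        $line =~ s/^\s+|\s+$//g;          # trim surrounding whitespace
        next if $line eq '';              # skip blank separator lines
        if ($line =~ /^====\s*(\S+)$/) {  # table header: start a new table
            $current = $1;
            $tables{$current} = {};
        }
        elsif (defined $current) {        # a code of the current table
            $tables{$current}{$line} = 1;
        }
        # Lines seen before the first header (field rules) are ignored here
    }
    return \%tables;
}

my $tables = parse_tables(<<'END');
==== CTRY
AF
AL
DZ

==== LANG
aar
afh
END

# Look up two codes: AF is listed in the sample table, FR is not
for my $code (qw(AF FR)) {
    my $status = $tables->{CTRY}{$code} ? 'valid' : 'unknown';
    print "CTRY $code: $status\n";
}
```

A subfield rule such as a+CTRY would then amount to checking that the subfield content is a key of the CTRY table.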

SEE ALSO

AUTHOR

Frédéric Demians <f.demians@tamil.fr>

COPYRIGHT AND LICENSE

This software is copyright (c) 2014 by Frédéric Demians.

This is free software; you can redistribute it and/or modify it under the same terms as the Perl 5 programming language system itself.