Biblio::Citation::Parser 1.10 Documentation - Reference Parsing


Parsing References

Biblio::Citation::Parser is designed for parsing citations, and this can be done very simply:

 use Biblio::Citation::Parser::Standard;
 my $parser = Biblio::Citation::Parser::Standard->new();
 my $metadata = $parser->parse("Jewell, M (2002) Parsing Examples");

The $metadata variable holds a reference to a hash containing the information extracted from the reference.
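
As a minimal sketch of inspecting the parsed result, the following walks a hash reference and lists each defined field. It uses a hand-built stand-in for $metadata so that it runs without the module installed; the describe_metadata helper is hypothetical, and the field names and values are illustrative assumptions, not actual parser output:

```perl
use strict;
use warnings;

# Stand-in for the value returned by $parser->parse(...).
# The keys and values here are illustrative assumptions only.
my $metadata = {
    aulast => 'Jewell',
    year   => '2002',
    title  => 'Parsing Examples',
};

# Hypothetical helper: return one "field: value" string per defined
# field, sorted by field name.
sub describe_metadata {
    my ($meta) = @_;
    my @lines;
    for my $field (sort keys %$meta) {
        next unless defined $meta->{$field};
        push @lines, "$field: $meta->{$field}";
    }
    return @lines;
}

print "$_\n" for describe_metadata($metadata);
```

Checking for defined values before use is a cautious choice here, since a given reference may not yield every field.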

If you'd prefer to use another parser, simply replace 'Standard' with the name of the appropriate module. Biblio::Citation::Parser is distributed with the Jiao module, which is a slightly modified version of a module created by Zhuoan Jiao. To use this instead of the Standard module, you would do the following:

 use Biblio::Citation::Parser::Jiao;
 my $parser = Biblio::Citation::Parser::Jiao->new();
 my $metadata = $parser->parse("Jewell, M (2002) Parsing Examples");

The Standard module provides slightly richer metadata than the Jiao module, but it relies on templates (see Biblio::Citation::Parser::Templates), so it requires updating as new citation formats are found.


Creating an OpenURL

Once you have the metadata from the reference, it is easy to create an OpenURL from it:

 use Biblio::Citation::Parser::Standard;
 use Biblio::Citation::Parser::Utils;
 my $parser = Biblio::Citation::Parser::Standard->new();
 my $metadata = $parser->parse("Jewell, M (2002) Parsing Examples");
 my $openurl = create_openurl($metadata);

The OpenURLs created by Biblio::Citation::Parser do not have a base URL prefixed, so you must prepend one before using them (the ParaCite base URL is http://paracite.eprints.org/cgi-bin/openurl.cgi).
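
As a minimal sketch of that prefixing step, the hypothetical helper below joins a base URL and a query string. It assumes that create_openurl returns a bare query string (an assumption worth checking against your installed version), and it also tolerates a leading '?'. The sample query string is an illustrative stand-in, not real output:

```perl
use strict;
use warnings;

# ParaCite base URL, as given in the text above.
my $base_url = 'http://paracite.eprints.org/cgi-bin/openurl.cgi';

# Hypothetical helper: join a base URL and an OpenURL query string.
# Tolerates a query string that already starts with '?'.
sub prefix_base_url {
    my ($base, $query) = @_;
    $query =~ s/^\?//;    # drop a leading '?' if present
    return $base . '?' . $query;
}

# Stand-in for the string returned by create_openurl($metadata).
my $openurl = 'aulast=Jewell&date=2002';

print prefix_base_url($base_url, $openurl), "\n";
```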

If you would like to try to extract more information from the metadata, you can use the decompose_openurl function:

 my ($enriched_metadata, @errors) = decompose_openurl($metadata);
 
This tries to extract information from SICIs, page ranges, etc., and also checks the fields for validity (the @errors array contains any problems found).

Note that create_openurl has been superseded by URI::OpenURL, but the metadata returned by trim_openurl is in the correct format to be passed to this module.


Metadata Structure

Biblio::Citation::Parser supports all of the fields specified in Table 1 of the OpenURL specification (http://www.sfxit.com/openurl/openurl.html). Specific parsers can add their own fields, but these are not exported when OpenURLs are created. Biblio::Citation::Parser::Standard provides the following extra fields:

marked
A marked-up version of the reference. e.g. <author>Jewell, M</author> (<year>2002</year>) <title>A title</title>.

match
The template matched by Biblio::Citation::Parser::Standard.

ref
The original reference.
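
To illustrate the distinction between OpenURL fields and parser-specific ones, here is a minimal sketch that separates the three Standard-specific fields listed above from the rest of the metadata. The split_metadata helper and the sample values are hypothetical, and the hash is a hand-built stand-in so that the example runs without the module installed:

```perl
use strict;
use warnings;

# Extra fields added by Biblio::Citation::Parser::Standard (from the list above).
my %extra_field = map { $_ => 1 } qw(marked match ref);

# Hypothetical helper: split metadata into (OpenURL fields, parser extras).
# Returns two hash references.
sub split_metadata {
    my ($meta) = @_;
    my (%openurl, %extras);
    for my $field (keys %$meta) {
        if ($extra_field{$field}) { $extras{$field}  = $meta->{$field} }
        else                      { $openurl{$field} = $meta->{$field} }
    }
    return (\%openurl, \%extras);
}

# Illustrative stand-in for parser output; values are assumptions.
my $metadata = {
    aulast => 'Jewell',
    marked => '<author>Jewell, M</author> (<year>2002</year>) <title>A title</title>',
    ref    => 'Jewell, M (2002) A title',
};

my ($openurl_fields, $extras) = split_metadata($metadata);
print join(', ', sort keys %$extras), "\n";    # prints "marked, ref"
```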
