The London Perl and Raku Workshop takes place on 26th Oct 2024. If your company depends on Perl, please consider sponsoring and/or attending.

NAME

XRD::Parser - Parse XRD files into RDF::Trine models

VERSION

0.03

SYNOPSIS

  use RDF::Query;
  use XRD::Parser;
  
  my $parser = XRD::Parser->new(undef, "http://example.com/foo.xrd");
  $parser->consume;
  
  my $results = RDF::Query->new(
    "SELECT * WHERE {?who <http://spec.example.net/auth/1.0> ?auth.}")
    ->execute($parser->graph);
        
  while (my $result = $results->next)
  {
    print $result->{'auth'}->uri . "\n";
  }

DESCRIPTION

While XRD has a rather different history, it turns out it can mostly be thought of as a serialisation format for a limited subset of RDF.

This package ignores the order of <Link> elements, as RDF is a graph format with no concept of statements coming in an "order". The XRD spec says that grokking the order of <Link> elements is only a SHOULD. That said, if you're concerned about the order of <Link> elements, the callback routines allowed by this package may be of use.

This package aims to be roughly compatible with RDF::RDFa::Parser's interface.

$p = XRD::Parser->new($content, $uri, \%options, $store);

This method creates a new XRD::Parser object and returns it.

The $content variable may contain an XML string, or a XML::LibXML::Document. If a string, the document is parsed using XML::LibXML::Parser, which may throw an exception. XRD::Parser does not catch the exception.

$uri the supposed URI of the content; it is used to resolve any relative URIs found in the XRD document. Also, if $content is empty, then XRD::Parser will attempt to retrieve $uri using LWP::Simple.

Options [default in brackets]:

  * tdb_service     - thing-described-by.org when possible. [0] 

$storage is an RDF::Trine::Storage object. If undef, then a new temporary store is created.

$p = XRD::Parser->hostmeta($uri);

This method creates a new XRD::Parser object and returns it.

The parameter may be a URI (from which the hostname will be extracted) or just a bare host name (e.g. "example.com"). The resource "/.well-known/host-meta" will then be fetched from that host using an appropriate HTTP Accept header, and the parser object returned.

$p->uri

Returns the base URI of the document being parsed. This will usually be the same as the base URI provided to the constructor.

Optionally it may be passed a parameter - an absolute or relative URI - in which case it returns the same URI which it was passed as a parameter, but as an absolute URI, resolved relative to the document's base URI.

This seems like two unrelated functions, but if you consider the consequence of passing a relative URI consisting of a zero-length string, it in fact makes sense.

$p->dom

Returns the parsed XML::LibXML::Document.

$p->set_callbacks(\&func1, \&func2)

Set callbacks for handling RDF triples extracted from the document. The first function is called when a triple is generated taking the form of (resource, resource, resource). The second function is called when a triple is generated taking the form of (resource, resource, literal).

The parameters passed to the first callback function are:

  • A reference to the XRD::Parser object

  • A reference to the XML::LibXML element being parsed

  • Subject URI or bnode

  • Predicate URI

  • Object URI or bnode

The parameters passed to the second callback function are:

  • A reference to the XRD::Parser object

  • A reference to the XML::LibXML element being parsed

  • Subject URI or bnode

  • Predicate URI

  • Object literal

  • Datatype URI (possibly undef or '')

  • Language (possibly undef or '')

In place of either or both functions you can use the string 'print' which sets the callback to a built-in function which prints the triples to STDOUT as Turtle. Either or both can be set to undef, in which case, no callback is called when a triple is found.

Beware that for literal callbacks, sometimes both a datatype *and* a language will be passed. (This goes beyond the normal RDF data model.)

set_callbacks (if used) must be used before consume.

$p->consume;

This method processes the input DOM and sends the resulting triples to the callback functions (if any).

$p->graph()

This method will return an RDF::Trine::Model object with all statements of the full graph.

It makes sense to call consume before calling graph. Otherwise you'll just get an empty graph.

SEE ALSO

RDF::Trine, RDF::Query, RDF::RDFa::Parser.

http://www.perlrdf.org/.

AUTHOR

Toby Inkster, <tobyink@cpan.org>

COPYRIGHT AND LICENSE

Copyright (C) 2009 by Toby Inkster

This library is free software; you can redistribute it and/or modify it under the same terms as Perl itself, either Perl version 5.8.1 or, at your option, any later version of Perl 5 you may have available.