NAME
Net::OAI::Record::NamespaceFilter - general filter class based on namespace URIs
SYNOPSIS
    $plug = Net::OAI::Record::NamespaceFilter->new(); # Noop

    $multihandler = Net::OAI::Record::NamespaceFilter->new(
    );

    $saxfilter = new SOME_SAX_Filter;
    ...
    $filter = Net::OAI::Record::NamespaceFilter->new(
        '*' => $saxfilter, # '*' for any namespace
    );

    $filter = Net::OAI::Record::NamespaceFilter->new(
        '*' => sub { my $x = ""; return XML::SAX::Writer->new(Output => \$x); },
    );
DESCRIPTION
This filter forwards any element belonging to one of the configured namespaces to the associated SAX filter, and forwards all of the element's children (regardless of their respective namespaces) to the same one. It can be used either as a metadataHandler or as a recordHandler.

This SAX filter takes a hashref namespaces as its argument, with namespace URIs as keys ('*' for "any namespace"). Each value is one of the following:
- undef

Matching elements and their subelements are suppressed.

If the list of namespaces is empty, or undef is connected to the filter, it effectively acts as a plug to Net::OAI::Harvester. This can come in handy if you plan to get at the raw result by other means, e.g. by tapping the user agent or by accessing the result's xml() method:

    $plug = Net::OAI::Record::NamespaceFilter->new();
    $harvester = Net::OAI::Harvester->new( [
        baseURL => ...,
    ] );

    $tapped_by_ua = "";
    open($TAP, ">", \$tapped_by_ua);
    $harvester->userAgent()->add_handler(response_data => sub {
        my ($response, $ua, $h, $data) = @_;
        print $TAP $data;
    });

    $list = $harvester->listRecords(
        metadataPrefix => 'a_strange_one',
        recordHandler => $plug,
    );

    print $tapped_by_ua;    # complete OAI response
    print $list->xml();     # should be exactly the same
Comment: This is quite an efficient way of not processing the XML content of OAI records received.
- a class name of a SAX filter

As usual, for any record element of the OAI response a new instance is created.

    # end_document() of instances of MyWriter returns something meaningful...
    $consumer = Net::OAI::Record::NamespaceFilter->new('*' => 'MyWriter');

    $filter = Net::OAI::Record::NamespaceFilter->new('*' => $consumer);

    $list = $harvester->listAllRecords(
        metadataPrefix => 'oai_dc',
        recordHandler => $filter,
    );

    while ( $r = $list->next() ) {
        next if $r->status() eq "deleted";
        $xmlstringref = $r->recorddata()->result('*');
        ...
    };
Note: The handlers are instantiated for each single OAI record in the response and will see one start_document() and one end_document() event in any case. This behavior differs from that of handler class names specified directly as metadataHandler or recordHandler for a request: instances created that way will never see such events.

- a code reference for a constructor

Must return a SAX filter ready to accept a new document.
The following example returns a string serialization for each single record:
    # end_document() events will return \$x
    $constructor = sub { my $x = ""; return XML::SAX::Writer->new(Output => \$x); };

    $filter = Net::OAI::Record::NamespaceFilter->new('*' => $constructor);

    $list = $harvester->listRecords(
        metadataPrefix => 'oai_dc',
        recordHandler => $filter,
    );

    while ( $r = $list->next() ) {
        $xmlstringref = $r->recorddata()->result('*');
        ...
    };
Comment: This example shows an approach to isolating the "true contents" of individual response records without having to provide a SAX handler class of one's own (the only additional prerequisite is XML::SAX::Writer). What you get, however, is a serialized XML document, which then has to be parsed again for further processing ...
- an already instantiated SAX filter

As usual in this case, no start_document() and end_document() events are forwarded to the filter.

    open $fh, ">", $some_file;
    $builder = XML::SAX::Writer->new(Output => $fh);
    $builder->start_document();
    $rootEL = {
        Name       => 'collection',
        LocalName  => 'collection',
        Prefix     => "",
        Attributes => {}
    };
    $builder->start_element($rootEL);

    # filter for OAI-Namespace in records: forward all
    $filter = Net::OAI::Record::NamespaceFilter->new(
        'http://www.openarchives.org/OAI/2.0/' => $builder);

    $list = $harvester->listRecords(
        metadataPrefix => 'a_strange_one',
        metadataHandler => $filter,
    );
    # handle resumption tokens if more than the first
    # chunk shall be stored into $fh ....

    $builder->end_element($rootEL);
    $builder->end_document();
    close($fh);
    # ... process contents of $some_file
In this example, calling the result() method for individual records in the response will probably not be of much use.

Caution: Depending on the namespaces specified, even handlers that are freshly instantiated for each OAI record might be fed more than one top-level XML element.
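
The forwarding rule and the caution above can be illustrated with a small stand-alone simulation. This is only a sketch of the rule as described in this section, not the module's implementation: forwarded_elements() is a hypothetical helper, the event tuples and the urn:example:* namespaces are made up, and the hash values are treated as plain flags (the undef-means-suppress case is ignored).

```perl
use strict;
use warnings;

# Hypothetical helper, NOT part of Net::OAI::Record::NamespaceFilter:
# given a flat list of SAX-like events and a set of namespace URIs,
# return the names of the elements that would be forwarded.
sub forwarded_elements {
    my ($events, %wanted) = @_;
    my $depth = 0;    # > 0 while we are inside a forwarded subtree
    my @seen;
    for my $ev (@$events) {
        my ($type, $ns, $name) = @$ev;
        if ($type eq 'start') {
            my $match = exists($wanted{$ns}) || exists($wanted{'*'});
            # Children of a forwarded element follow it,
            # whatever their own namespace is.
            if ($depth > 0 or $match) {
                push @seen, $name;
                $depth++;
            }
        }
        elsif ($type eq 'end') {
            $depth-- if $depth > 0;
        }
    }
    return @seen;
}

# Made-up event stream: one record element with a child from another
# namespace, followed by a sibling element from that other namespace.
my @events = (
    [ start => 'urn:example:a', 'a:record' ],
    [ start => 'urn:example:b', 'b:title'  ],
    [ end   => 'urn:example:b', 'b:title'  ],
    [ end   => 'urn:example:a', 'a:record' ],
    [ start => 'urn:example:b', 'b:extra'  ],
    [ end   => 'urn:example:b', 'b:extra'  ],
);

my @only_a = forwarded_elements(\@events, 'urn:example:a' => 1);
my @any    = forwarded_elements(\@events, '*' => 1);

print "urn:example:a only: @only_a\n";    # a:record b:title
print "any namespace:      @any\n";       # a:record b:title b:extra
```

With '*' the same handler receives both a:record and b:extra, i.e. two top-level elements, which is the situation the caution warns about.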
METHODS
new( [%namespaces] )
Creates a handler suitable as recordHandler or metadataHandler. %namespaces has namespace URIs as keys and values of one of the four types described above.
result( [namespace] )
If called with a namespace, it returns the result of the handler, i.e. what end_document() returned for the record in question. Otherwise it returns a hashref for all the results with the corresponding namespaces as keys.
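
Since result() hands back whatever end_document() returned, a handler class is only useful here if its end_document() returns something meaningful. The following is a minimal sketch of such a class under stated assumptions: MyWriter is the hypothetical class name borrowed from the example in the DESCRIPTION, it is written as a plain class with hand-rolled event methods (a real handler would more likely build on a SAX base class), and the events are fed in by hand instead of coming from a harvester.

```perl
package MyWriter;    # hypothetical handler class, for illustration only
use strict;
use warnings;

sub new {
    my ($class) = @_;
    return bless { text => '' }, $class;
}

# Reset the accumulator at the start of each document (i.e. each record).
sub start_document { my ($self) = @_; $self->{text} = ''; return; }

sub start_element { return; }    # element structure is ignored in this sketch
sub end_element   { return; }

# Collect character data as it arrives.
sub characters {
    my ($self, $chars) = @_;
    $self->{text} .= $chars->{Data};
    return;
}

# The return value of end_document() is what result() is documented to return.
sub end_document {
    my ($self) = @_;
    return \$self->{text};
}

package main;

# Drive the handler by hand, the way a SAX pipeline would for one record.
my $h = MyWriter->new();
$h->start_document({});
$h->start_element({ Name => 'dc:title', LocalName => 'title', Prefix => 'dc', Attributes => {} });
$h->characters({ Data => 'An example title' });
$h->end_element({ Name => 'dc:title', LocalName => 'title', Prefix => 'dc' });
my $result = $h->end_document({});

print "$$result\n";
```

Returning a reference to the collected string mirrors the XML::SAX::Writer examples, where the result for a record is a string reference.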
AUTHOR
Thomas Berger <ThB@gymel.com>