NAME

XML::eXistDB::RPC - access eXist databases via RPC

DESCRIPTION

This module is a full implementation of the XML-RPC interface to the eXist Database. It is not just a one-to-one implementation: some methods are smarter and many methods are renamed to correct historical mistakes. Hopefully, the result is more readable.

warning: some methods are lightly tested, but many have not been tested in real-life use. I have a long list of bugs for eXist 1.4, and hope that they will get fixed in a next release. Please do not be disappointed: contribute tests and fixes!

warning: be careful when upgrading to any release before 0.90, because such releases may change method behavior and naming. See the ChangeLog!

Perl interface

The methods in this module provide access to all facilities the XML-RPC protocol interface offers. However, some of these calls are at a lower level than is practical in a programmer's interface. A few larger wrapper methods were created, most importantly uploadDocument() and downloadDocument().

Some defaults can be set at initiation (new()), such that repetition can be avoided.

Definitions

The whole database (Repository) contains sub-databases (Collections), which can have sub-collections themselves. Any collection contains Documents (indexable XML) and Binaries (raw data). When both documents and binaries are accepted, we speak about a Resource.

Naming conventions

The XML-RPC method names are a mess: a typical example of many years of growth. To repair that, consistent naming conventions are introduced.

Any method describeXXX() collects a HASH with details about XXX, and any listXXX() collects a list of XXX names. The typical Java get prefixes on some methods were removed in favor of better-named alternatives: sometimes list, sometimes describe, often something completely different. Getter and setter naming for class attributes should not be used in interfaces (and is very un-Perlish).

Most methods already had the form "<action><class>" (like "removeCollection"), but in some random spots, the "class" was not present in the name. This has been repaired, which lowers the need to read the explanation of the methods to understand what they are doing.

Return codes

RPC is a network protocol. Just like with operating system calls, you should always check the return status of each call! Of course, this module could simply ignore the existence of fault conditions, to provide a much simpler programmer's interface. But keep in mind: handling error conditions is very important in the long run. It is a burden for the first small programs, but a desperate need for maintainability.

All methods return a LIST, where the first scalar is a return code (RC). When that code is 0, all went well. Otherwise, the code represents the transport error or the exception (refusal) as reported by the server logic. In either case, the second scalar in the returned list contains the error message. For instance,

  my $user = 'guest';
  my ($rc, $details) = $db->describeUser($user);
  $rc==0
      or die "cannot get user info for `$user': $details ($rc)\n";
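
Because every method follows this convention, the check can be factored out into a small wrapper. The following is only a sketch, not part of this module: call_or_die() and the My::FakeDB stand-in class are hypothetical names, used so that the example runs without an eXist server.

```perl
use 5.014;
use strict;
use warnings;

# Hypothetical helper (not provided by XML::eXistDB::RPC): call a method,
# die when the return code is non-zero, otherwise hand back the payload.
sub call_or_die {
    my ($db, $method, @args) = @_;
    my ($rc, @result) = $db->$method(@args);
    $rc == 0
        or die "$method failed: $result[0] ($rc)\n";
    return wantarray ? @result : $result[0];
}

# Stand-in object which mimics the (RC, data-or-message) convention.
package My::FakeDB {
    sub new { bless {}, shift }
    sub describeUser {
        my ($self, $user) = @_;
        $user eq 'guest'
          ? (0, {name => 'guest', groups => ['guest']})
          : (1, "user $user does not exist");
    }
}

my $db   = My::FakeDB->new;
my $info = call_or_die($db, describeUser => 'guest');
print "groups: @{$info->{groups}}\n";
```

With a real connection, $db would be the XML::eXistDB::RPC object; the wrapper itself only relies on the documented return convention.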

METHODS

Constructors

XML::eXistDB::RPC->new(%options)

You must either specify your own XML::Compile::RPC::Client object with the rpc option, or a destination which will be used to create such an object.

 -Option            --Default
  chunk_size          32
  compress_upload     128
  destination         <undef>
  format              []
  password            'guest'
  prettyprint_upload  <false>
  repository          '/db'
  rpc                 <undef>
  schemas             <created>
  user                'guest'

chunk_size => KILOBYTES

Send or download data in chunks (fragments) of this size when the size exceeds this quantity. If 0, then chunking is disabled.
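
To illustrate what chunking amounts to, here is a minimal sketch of splitting data into fragments. The helper split_chunks() is hypothetical and not how this module is necessarily implemented; it also assumes that one kilobyte means 1024 bytes.

```perl
use strict;
use warnings;

# Hypothetical helper: split $data into pieces of at most $kb kilobytes.
# A size of 0 disables chunking, as does data which fits in one chunk.
sub split_chunks {
    my ($data, $kb) = @_;
    my $size = $kb * 1024;          # assumption: 1 KB = 1024 bytes
    return ($data) if $size == 0 || length($data) <= $size;
    return unpack "(a$size)*", $data;
}

my @chunks = split_chunks('x' x 70_000, 32);
printf "%d chunks, the last one has %d bytes\n",
    scalar @chunks, length $chunks[-1];
```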

compress_upload => KILOBYTES

Compress the upload of resources when their size is over this number of KILOBYTES in size. This will cost performance mainly on the client.

destination => URI

Where the RPC server is (the eXistDB access point), for instance http://localhost:8080/exist/xmlrpc

format => ARRAY|HASH

The default for "options" which can be passed with many methods.

password => STRING
prettyprint_upload => BOOLEAN
repository => STRING

The repository; the top-level collection.

rpc => OBJECT
schemas => OBJECT

When you need to do complex things with the eXist schemas, you may prepare an XML::eXistDB object beforehand. However, that shouldn't be needed under normal circumstances. By default, such an object is created for you.

user => USERNAME

Used as default when a username is required. For now, that is only used by lockResource().

Helpers

$obj->schemas()

Returns the XML::eXistDB object which contains all eXistDB specific schema information. At first call, the object will get created for you. Once created, you'll always get the same.

$obj->trace()

Returns the trace information from the last command executed over RPC. Nearly all methods in this class only perform one RPC call. You can find the timings, http request, and http response in the returned HASH.

Format

A number of methods support formatting options, to control the output. With the method call, these parameters can be passed as a list of pairs.

 indent: returns indented pretty-print XML.            yes|no
 encoding: character encoding used for the output.     <string>
 omit-xml-declaration: XML declaration to the head.    yes|no
 expand-xincludes: expand XInclude elements.           yes|no
 process-xsl-pi: apply stylesheet to the output.       yes|no
 highlight-matches: show result from fulltext search.  elements|attributes|both
 stylesheet: to apply. rel-path from database          <path>
 stylesheet-params: stylesheet params                  <HASH>

The use of the "stylesheet-params" is simplified compared to the official XML-RPC description, with a nested HASH.
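
As an illustration, a set of formatting options with nested "stylesheet-params" might be written as below. This is a sketch: the stylesheet path and parameter names are made up, and the commented call only shows where such a list could be passed.

```perl
use strict;
use warnings;

# Formatting options as a flat list of pairs; all values are illustrative.
my @format = (
    indent              => 'yes',
    encoding            => 'UTF-8',
    stylesheet          => 'xsl/report.xsl',     # hypothetical path
    'stylesheet-params' => { title => 'Orders', year => 2015 },
);

# The list could then accompany a call, for instance:
#   my ($rc, $doc) = $db->downloadDocument($resource, @format);

# Show the nested parameters, sorted by name.
my %format = @format;
my $params = $format{'stylesheet-params'};
print "$_ = $params->{$_}\n" for sort keys %$params;
```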

Sending XML

Some methods accept a DOCUMENT, which can be an XML::LibXML::Document node, a string containing XML, a SCALAR (ref-string) with the same, or a filename.
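
The four accepted forms can be told apart with a few simple tests. The sketch below is not the module's own code: document_kind() is a hypothetical helper, and treating any plain string which starts with "<" as XML (and anything else as a filename) is an assumption.

```perl
use strict;
use warnings;
use Scalar::Util qw(blessed);

# Hypothetical helper: classify a DOCUMENT argument.
sub document_kind {
    my $doc = shift;
    return 'node'
        if blessed($doc) && $doc->isa('XML::LibXML::Document');
    return 'ref-string' if ref($doc) eq 'SCALAR';
    die "unsupported DOCUMENT type: " . ref($doc) . "\n"
        if ref($doc);
    return 'string' if $doc =~ /^\s*</;   # assumption: XML starts with '<'
    return 'filename';
}

my $xml = '<order id="42"/>';
print document_kind($_), "\n"
    for $xml, \$xml, 'data/order.xml';
```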

Repository

$obj->backup($user, $password, $tocoll, $fromcoll)

Returns success. Create a backup of the $fromcoll into the $tocoll, using $user and $password to write it. There is also an XQuery function to produce backups.

example:

  my ($rc, $ok) = $db->backup('sys', 'xxx', '/db/backup', '/db/orders');
  $rc==0 or die "$rc $ok";

$obj->hasCollection($collection)

Does the $collection identified by name exist in the repository?

example:

  my ($rc, $exists) = $db->hasCollection($name);
  $rc and die "$exists (RC=$rc)";
  if($exists) ...

$obj->hasDocument($docname)

Returns whether a document named $docname exists in the repository.

example:

  my ($rc, $exists) = $db->hasDocument($name);
  if($rc==0 && $exists) ....

$obj->isXACMLEnabled()

Returns whether the eXtensible Access Control Markup Language (XACML) by OASIS is enabled on the database.

example:

  my ($rc, $enabled) = $db->isXACMLEnabled;
  if(!$rc && $enabled) { ... }

$obj->shutdown( [$delay] )

Shutdown the database. The $delay is in milliseconds.

example:

  my ($rc, $success) = $db->shutdown(3000);  # 3 secs
  $rc==0 or die "$rc $success";

$obj->sync()

Force the synchronization of all db page cache buffers.

example:

  my ($rc, $success) = $db->sync;

Collections

$obj->collectionCreationDate( [$collection] )

[non-API] Returns the date of the creation of the $collection, by default from the root.

example:

  my ($rc, $date) = $db->collectionCreationDate($coll);
  $rc and die "$rc $date";
  print $date;  # e.g. "2009-10-21T12:13:13Z"

$obj->configureCollection($collection, $configuration, %options)

The $configuration is a whole .xconfig, describing the collection. This can be an XML::LibXML::Document node, a stringified XML document, or a HASH.

When the $configuration is a HASH, the data will get formatted by XML::eXistDB::createCollectionConfig().

The configuration will be placed in /db/system/config/$collection, inside the database.

 -Option  --Default
  beautify  <new(prettyprint_upload)>

beautify => BOOLEAN

Produce a readable configuration file.

example:

  my %index1   = (path => ..., qname => .., type => ...);
  my @indexes  = (\%index1, \%index2, \%index3);
  my %fulltext = (default => 'none', attributes => 0, alphanum => 0);
  my %trigger1 = (parameter => [ {name => 'p1', value => '42'} ]);
  my @triggers = (\%trigger1, \%trigger2);

  my %config   =
    ( index      => {fulltext => \%fulltext, create => \@indexes}
    , triggers   => {trigger  => \@triggers}
    , validation => {mode     => 'yes'}
    );

  my ($rc, $success) = $db->configureCollection($name, \%config);

$obj->copyCollection( $from, $to | <$tocoll, $subcoll> )

Copy the $from collection to a new $to. With three arguments, $subcoll is a collection within $tocoll.

example:

  my ($rc, $succ) = $db->copyCollection('/db/from', '/db/some/to');
  my ($rc, $succ) = $db->copyCollection('/db/from', '/db/some', 'to');

$obj->createCollection( $collection, [$date] )

Is a success if the collection already exists or can be created.

example: createCollection

  my $subcoll = "$supercoll/$myname";
  my ($rc, $success) = $db->createCollection($subcoll);
  $rc==0 or die "$rc $success";

$obj->describeCollection( [$collection], %options )

Returns the RC and a HASH with details. The details are the same as returned with getCollectionDesc(), excluding details about documents.

 -Option   --Default
  documents  <false>

documents => BOOLEAN

example:

  my ($rc, $descr) = $db->describeCollection($coll, documents => 1);
  $rc and die $rc;
  print Dumper $descr;  # Data::Dumper::Dumper

$obj->listResources( [$collection] )

[non-API] Returns ... with all documents in the $collection. Without $collection, it will list all documents in the whole repository.

example:

  my ($rc, @elems) = $db->listResources;
  $rc==0 or die "error: $elems[0] ($rc)";

$obj->moveCollection( $from, $to | <$tocoll, $subcoll> )

Move the $from collection to a new $to. With three arguments, $subcoll is a collection within $tocoll.

example:

  my ($rc, $succ) = $db->moveCollection('/db/from', '/db/some/to');
  my ($rc, $succ) = $db->moveCollection('/db/from', '/db/some', 'to');

$obj->reindexCollection($collection)

Reindex all documents in a certain collection.

example:

   my ($rc, $success) = $db->reindexCollection($name);
   die "error: $success ($rc)" if $rc;
   die "failed" unless $success;

$obj->removeCollection($collection)

Remove an entire collection from the database.

example:

   my ($rc, $success) = $db->removeCollection($name);
   die "error: $rc $success" if $rc;
   die "failed" unless $success;

$obj->subCollections( [$collection] )

[non-API] Returns a list of sub-collections for this collection, based on the results of describeCollection(). The returned names are made absolute.

example:

  my ($rc, @subs) = $db->subCollections($coll);
  $rc and die "$rc $subs[0]";
  print "@subs\n";

Permissions

$obj->describeCollectionPermissions( [$collection] )

Returns the RC and a HASH which shows the permissions on the $collection. The output of the API is rigorously rewritten to simplify implementation.

The HASH contains absolute collection names as keys, and then as values a HASH with user, group and mode.
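
A structure of that shape can be walked as follows. This is a sketch with invented sample data: the collection names, users and groups are made up, and representing mode as an integer is an assumption, because the encoding of that value is not specified here.

```perl
use strict;
use warnings;

# Invented sample in the documented shape: collection name => details.
my $perms = {
    '/db/orders'     => { user => 'admin', group => 'dba',   mode => 0755 },
    '/db/orders/new' => { user => 'guest', group => 'guest', mode => 0744 },
};

# Hypothetical helper: one formatted line per collection, sorted by name.
sub format_permissions {
    my $perms = shift;
    return map {
        my $p = $perms->{$_};
        sprintf "%-16s %s:%s %o", $_, $p->{user}, $p->{group}, $p->{mode};
    } sort keys %$perms;
}

print "$_\n" for format_permissions($perms);
```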

$obj->describeResourcePermissions($resource)

[non-API] Returns a HASH with permission details about a $resource.

$obj->describeUser($username)

[non-API] returns a HASH with user information.

example:

  my ($rc, $info) = $db->describeUser($username);
  $rc==0 or die "error: $info ($rc)";
  my @groups = @{$info->{groups}};

$obj->listDocumentPermissions( [$collection] )

List the permissions for all resources in the $collection.

$obj->listGroups()

[non-API] list all defined groups. Returns a vector.

example:

  my ($rc, @groups) = $db->listGroups;
  $rc==0 or die "$groups[0] ($rc)";

$obj->listUsers()

[non-API] Returns a LIST with all defined usernames.

example:

  my ($rc, @users) = $db->listUsers;
  $rc==0 or die "error $users[0] ($rc)";

$obj->login( $username, [$password] )

[non-API] Change the $username (as known by eXistDB). When you specify a non-existing $username or a wrong $password, you will not get more data from this connection. The next request will tell.

$obj->removeUser($username)

Returns true on success.

$obj->setPermissions( $target, $permissions, [$user, $group] )

The $target which is addressed is either a resource or a collection.

The $permissions are specified either as an integer value or using a modification string. The bit encoding of the integer value corresponds to Unix conventions (where 'x' is replaced by 'update'). The modification string has as syntax: [user|group|other]=[+|-][read|write|update][, ...]
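
A modification string can be assembled from its parts, as in the following sketch. The helper permission_string() is hypothetical, and joining the clauses with a plain comma (without a space) is an assumption about what the server accepts.

```perl
use strict;
use warnings;

my %WHO  = map { $_ => 1 } qw(user group other);
my %WHAT = map { $_ => 1 } qw(read write update);

# Hypothetical helper: turn [who, sign, what] triples into a
# modification string like "user=+write,other=-read".
sub permission_string {
    my @parts;
    for my $change (@_) {
        my ($who, $sign, $what) = @$change;
        $WHO{$who}   or die "unknown class '$who'\n";
        $WHAT{$what} or die "unknown permission '$what'\n";
        $sign eq '+' || $sign eq '-' or die "sign must be + or -\n";
        push @parts, "$who=$sign$what";
    }
    return join ',', @parts;
}

my $mod = permission_string([user => '+', 'write'], [other => '-', 'read']);
print "$mod\n";

# The string could then be used as:
#   my ($rc, $ok) = $db->setPermissions('/db/orders', $mod);
```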

$obj->setUser( $user, $password, $groups, [$home] )

Modifies or creates a repository user. The $password is a plain-text password. $groups are specified as a single scalar or an ARRAY. The first group is the user's primary group.

Resources

$obj->copyResource($from, $tocoll, $toname)

example:

  my ($rc, $success) = $db->copyResource(...);

$obj->countResources( [$collection] )

[non-API] Returns the number of resources in the $collection.

example:

  my ($rc, $count) = $db->countResources($collection);

$obj->describeResource($resource)

Returns details about a $resource (which is a document or a binary).

example:

  my ($rc, $details) = $db->describeResource($resource);

$obj->getDocType($document)

Returns details about the $document, the docname, public-id and system-id as list of three.

example:

  my ($rc, $docname, $public, $system) = $db->getDocType($doc);

$obj->lockResource( $resource, [$username] )

$obj->moveResource($from, $tocoll, $toname)

example:

  my ($rc, $success) = $db->moveResource(...);

$obj->removeResource($docname)

[non-API] Remove a document from the repository by name. This method's name is more consistent than the official API name remove().

$obj->setDocType($document, $typename, $public_id, $system_id)

Add DOCTYPE information to a $document.

example:

  $db->setDocType($doc, "HTML"
     , "-//W3C//DTD HTML 4.01 Transitional//EN"
     , "http://www.w3.org/TR/html4/loose.dtd");

Will add to the document

  <!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN"
   "http://www.w3.org/TR/html4/loose.dtd">

$obj->uniqueResourceName( [$collection] )

Produces a random (and hopefully unique) resource-id (string) within the $collection. The returned id looks something like fe7c6ea4.xml.

example:

  my ($rc, $id) = $db->uniqueResourceName($coll);

$obj->unlockResource($resource)

Returns its success.

$obj->whoLockedResource($resource)

[non-API] Returns a username.

Download documents

$obj->downloadDocument($resource, $format)

Returns a document as byte array.

$obj->listResourceTimestamps($resource)

[non-API] Returns the creation and modification dates.

example:

   my ($rc, $created, $modified) = $db->listResourceTimestamps($resource);
   $rc==0 or die "error: $created ($rc)";

Upload documents

$obj->downloadBinary($resource)

[non-API] Get the bytes of a binary file from the server.

example:

  my ($rc, $bytes) = $db->downloadBinary($resource);

$obj->uploadBinary( $resource, $bytes, $mime, $replace, [$created, $modified] )

[non-API] The $bytes can be passed as string or better as string reference.

example:

  my ($rc, $ok) = $db->uploadBinary($name, $bytes, 'text/html', 1);

$obj->uploadDocument($resource, $document, %options)

[non-API] Hide all the different kinds of uploads via parse() or upload() behind one interface.

It depends on the size of the document and the type of DATA provided, whether upload(), uploadCompressed(), or parse() is used to transmit the data to the server.

 -Option       --Default
  beautify       <false>
  chunk_size     <new(chunk_size)>
  compress       <new(compress_upload)>
  creation_date  <undef>
  is_xml         <false>
  mime_type      'text/xml'
  modify_date    <undef>
  replace        <false>

beautify => BOOLEAN
chunk_size => KILOBYTES
compress => KILOBYTES
creation_date => DATE
is_xml => BOOLEAN # treatAsXML
mime_type => STRING
modify_date => DATE
replace => BOOLEAN

Queries

Compiled queries

$obj->compile($query, $format)

Returns a HASH.

$obj->describeCompile($query, $format)

[non-API] Returns a string which contains the diagnostics of compiling the query.

$obj->execute($queryhandle, $format)

Returns a HASH.

Query returns result as set

$obj->describeResultSet($resultset)

[non-API] Retrieve a summary of the result set identified by its result-set-id. This method returns a HASH with simple values queryTime (milliseconds) and hits (number of results). Besides, it contains complex structures documents and doctypes.

$obj->executeQuery( $query, [$encoding], [$format] )

Run the $query given in the specified $encoding. Returned is only an identifier to the result.

example:

   my ($rc1, $set)   = $db->executeQuery($query);
   my ($rc2, $count) = $db->numberOfResults($set);
   my ($rc3, @data)  = $db->retrieveResults($set);
   $db->releaseResultSet($set);

$obj->numberOfResults($resultset)

[non-API] Returns the number of answers in the RESULT set of a query. Replaces getHits().

$obj->releaseResultSet( $resultset, [$params] )

[non-API] Give-up on the $resultset on the server.

$obj->retrieveResult( $resultset, $pos, [$format] )

[non-API] retrieve a single result from the RESULT-SET. Replaces retrieve() and retrieveFirstChunk().

$obj->retrieveResults( $resultset, [$format] )

Replaces retrieveAll() and retrieveAllFirstChunk().

Query returns result

$obj->query( $query, $limit, [$first], [$format] )

Returns a document of the collected results.

This method is deprecated according to the Java description, in favor of executeQuery(); however, it is often used for its simplicity.

$obj->queryXPath($xpath, $docname, $node_id, %options)

When $docname is defined, the search is limited to that document, optionally further restricted to the node with the indicated $node_id.

example:

  my ($rc, $h) = $db->queryXPath($xpath, undef, undef);

Simple node queries

$obj->retrieveDocumentNode( $document, $nodeid, [$format] )

[non-API] Collect one node from a certain document. Doesn't matter how large: this method will always work (by always using chunks).

Modify document content

$obj->updateCollection($collection, $xupdate)

[non-API]

example:

  my ($rc, $some_int) = $db->updateCollection($coll, $xupdate);

$obj->updateResource( $resource, $xupdate, [$encoding] )

example:

  my ($rc, $some_int) = $db->updateResource($resource, $xupdate);

Indexing

$obj->getIndexedElements($collection, $recursive)
$obj->scanIndexTerms($collection, $begin, $end, $recursive)

or $db->scanIndexTerms(XPATH, $begin, $end).

example:

  my ($rc, $details) = $db->scanIndexTerms($xpath, $begin, $end);
  my ($rc, $details) = $db->scanIndexTerms($coll, $begin, $end, $recurse);

Helpers

Please avoid

Some standard API methods have gotten more powerful alternatives. Please avoid using the methods described in this section (although they do work).

Please avoid: collections

$obj->getCollectionDesc( [$collection] )

Please use describeCollection() with option documents => 1.

Please avoid: download documents

$obj->getDocument( $resource, $format|<$encoding, $pretty, $style> )

Please use downloadDocument(). Either specify $format parameters (a list of pairs), or three arguments. In the latter case, the $style must be present but may be undef. $style refers to a stylesheet document.

$obj->getDocumentAsString( $resource, $format|<$encoding, $pretty, $style> )

Please use downloadDocument(). See getDocument().

$obj->getDocumentData($resource, $format)

Please use downloadDocument(). Retrieve the specified document, but limit the number of bytes transmitted to avoid memory shortage on the server. The size of the chunks is controlled by the server. Returned is a HASH.

When the returned HASH contains supports-long-offset, then get the next Chunk with getNextExtendedChunk() otherwise use getNextChunk().

example:

   my ($rc, $chunk) = $db->getDocumentData($resource);
   my $doc = $chunk->{data};
   while($rc==0 && $chunk->{offset}!=0)
   {   ($rc, $chunk) = $chunk->{'supports-long-offset'}
       ? $db->getNextExtendedChunk($chunk->{handle}, $chunk->{offset})
       : $db->getNextChunk($chunk->{handle}, $chunk->{offset});
       $rc==0 and $doc .= $chunk->{data};
   }
   $rc==0 or die "error: $chunk ($rc)";

$obj->getNextChunk($tmpname, $offset)

Collect the next chunk, initiated with a getDocumentData(). The file is limited to 2GB.

$obj->getNextExtendedChunk($tmpname, $offset)

Collect the next chunk, initiated with a getDocumentData(). This method can only be used with servers which run an eXist which supports long files.

Please avoid: uploading documents

$obj->parse( $document, $resource, [$replace, [$created, $modified]] )

Please use uploadDocument(). Store the $document under the $resource name in the repository. When $replace is true, an existing document with that name is overwritten.

The $document can be a string containing XML or an XML::LibXML::Document.

$obj->parseLocal( $tempname, $resource, $replace, $mime, [$created, $modified] )

Please use uploadDocument(). Put the content of a document which was just uploaded to the server under some $tempname (received from upload()), as $resource in the database.

NB: Local means "server local", which is remote for us as clients.

$obj->parseLocalExt( $tempname, $resource, $replace, $mime, $isxml, [$created, $modified] )

Please use uploadDocument(). Put the content of a document which was just uploaded with upload() to the server under some $tempname (received from upload()) as $resource in the database. Like parseLocal(), but with an extra $isxml boolean, to indicate that the object is XML where the server cannot know that from the mime-type.

NB: Local means "server local", which is remote for us as clients.

$obj->storeBinary( $bytes, $resource, $mime, $replace, [$created, $modified] )

Please use uploadBinary().

$obj->upload( [$tempname], $chunk )

Please use uploadDocument(). Upload a document in parts to the server. The first upload will give you the TEMPoraryNAME for the object. You may leave that name out or explicitly state undef at that first call. When all data is uploaded, call parseLocal() or parseLocalExt().
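
For data of arbitrary length, the same sequence can be written as a loop. The sketch below is not taken from this module: upload_in_chunks() is a hypothetical helper, and My::FakeUpload is a stand-in which only mimics the documented upload() behavior, so that the loop runs without an eXist server.

```perl
use 5.014;
use strict;
use warnings;

# Stand-in for the server side: collects the chunks per temporary name.
package My::FakeUpload {
    sub new { bless { files => {}, count => 0 }, shift }
    sub upload {
        my $self  = shift;
        my $chunk = pop;            # the chunk is always the last argument
        my $tmp   = shift;          # optional: missing or undef at first call
        $tmp = 'tmp' . ++$self->{count} unless defined $tmp;
        $self->{files}{$tmp} .= $chunk;
        return (0, $tmp);
    }
}

# Hypothetical helper: send $data in pieces of $size bytes and
# return the temporary name which the first call produced.
sub upload_in_chunks {
    my ($db, $data, $size) = @_;
    my $tmp;
    for (my $offset = 0; $offset < length $data; $offset += $size) {
        (my $rc, $tmp) = $db->upload($tmp, substr($data, $offset, $size));
        $rc == 0 or die "upload failed: $tmp ($rc)\n";
    }
    return $tmp;
}

my $db  = My::FakeUpload->new;
my $tmp = upload_in_chunks($db, '<doc>' . ('x' x 2500) . '</doc>', 1000);
print "uploaded as $tmp\n";
```

After the loop, the temporary name would be handed to parseLocal() or parseLocalExt(), as described above.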

example:

   # start uploading
   my ($rc1, $tmp)  = $db->upload(undef, substr($data, 0, 1000));
   # or, equivalently: $db->upload(substr($data, 0, 1000));

   # send more chunks
   my ($rc2, undef) = $db->upload($tmp,  substr($data, 1000));

   # insert the document in the database
   my ($rc3, $ok);
   ($rc3, $ok) = $db->parseLocal($tmp, '/db/file.xml', 0, 'text/xml')
      if $rc1==0 && $rc2==0;

$obj->uploadCompressed( [$tempname], $chunk )

Please use uploadDocument(). Like upload(), although the chunks are part of a compressed file.

Please avoid: simple node queries

$obj->retrieveFirstChunk( <($doc, $nodeid) | ($resultset, $pos)>, [$format] )

Please use retrieveDocumentNode() or retrieveResult(). Two very different uses for this method: either retrieve the first part of a single node from a document, or retrieve the first part of an answer in a result set. See getNextChunk() for the next chunks.

Please avoid: collect query results

$obj->getDocumentChunked($docname, %options)

Please use downloadDocument()

example:

   my ($rc, $handle, $total_length) = $db->getDocumentChunked($doc);
   my $xml = $db->getDocumentNextChunk($handle, 0, $total_length-1);

$obj->getDocumentNextChunk($handle, $start, $length)

$obj->initiateBackup($directory)

Trigger the backup task to write to the $directory. Always returns true, but that only means that the initiation succeeded, not that the backup itself was successful.

$obj->isValidDocument($document)

Returns true when the $document (inside the database) is validated as correct.

$obj->retrieve( <($doc, $nodeid) | ($resultset, $pos)>, [$format] )

Please use retrieveResult() or retrieveDocumentNode().

$obj->retrieveAll( $resultset, [$format] )

Please use retrieveResults().

$obj->retrieveAllFirstChunk( $resultset, [$format] )

Please use retrieveResults().

$obj->retrieveAsString($document, $nodeid, %options)

Renamed methods

Quite a number of API methods have been renamed to be more consistent with other names. Using the new names should improve readability. The original names are still available:

  -- xml-rpc name           -- replacement name
  createResourceId          => uniqueResourceName
  dataBackup                => initiateBackup
  getBinaryResource         => downloadBinary
  getCreationDate           => collectionCreationDate
  getDocumentListing        => listResources
  getGroups                 => listGroups
  getHits                   => numberOfResults
  getPermissions            => describeResourcePermissions
  getResourceCount          => countResources
  getTimestamps             => listResourceTimestamps
  getUser                   => describeUser
  getUsers                  => listUsers
  hasUserLock               => whoLockedResource
  isValid                   => isValidDocument
  listCollectionPermissions => describeCollectionPermissions
  printDiagnostics          => describeCompile
  queryP                    => queryXPath
  querySummary              => describeResultSet
  releaseQueryResult        => releaseResultSet
  remove                    => removeResource
  xupdate                   => updateCollection
  xupdateResource           => updateResource

SYNOPSIS

  my $db = XML::eXistDB::RPC->new(destination => $uri);
  my ($rc1, $h) = $db->describeUser('guest');
  $rc1==0 or die "Error: $h\n";

  my ($rc2, $set) = $db->executeQuery($query);
  my ($rc3, @answers) = $db->retrieveResults($set);

SEE ALSO

This module is part of XML-ExistDB distribution version 0.14, built on July 25, 2015. Website: http://perl.overmeer.net/xml-compile/

Other distributions in this suite: XML::Compile, XML::Compile::SOAP, XML::Compile::WSDL11, XML::Compile::SOAP12, XML::Compile::SOAP::Daemon, XML::Compile::SOAP::WSA, XML::Compile::C14N, XML::Compile::WSS, XML::Compile::WSS::Signature, XML::Compile::Tester, XML::Compile::Cache, XML::Compile::Dumper, XML::Compile::RPC, XML::Rewrite and XML::LibXML::Simple.

Please post questions or ideas to the mailinglist at http://lists.scsys.co.uk/cgi-bin/mailman/listinfo/xml-compile . For live contact with other developers, visit the #xml-compile channel on irc.perl.org.

LICENSE

Copyrights 2010-2015 by [Mark Overmeer]. For other contributors see ChangeLog.

This program is free software; you can redistribute it and/or modify it under the same terms as Perl itself. See http://www.perl.com/perl/misc/Artistic.html