NAME

Syndication::NITF -- Parser for NITF v3.0 documents

VERSION

Version $Revision: 0.2 $, released $Date: 2001/12/19 05:30:13 $

SYNOPSIS

 use Syndication::NITF;

 my $nitf = Syndication::NITF->new("myNITFfile.xml");
 my $head = $nitf->gethead;

 my $title = $head->gettitle->getText;

 my $tobject = $head->gettobject;
 if ($tobject->gettobjecttype eq "news") {
   my $items = $tobject->gettobjectsubjectList;
   foreach my $item (@$items) {
     # process each subject header
     ...
   }
 }
 # ... etc ...

DESCRIPTION

Syndication::NITF is an object-oriented Perl interface to NITF documents, allowing you to manage (and one day create) NITF documents without any specialised NITF or XML knowledge.

NITF is a standard format for the markup of textual news content (e.g. newspaper and magazine articles), ratified by the International Press Telecommunications Council (http://www.iptc.org).

This module supports the version 3.0 DTD of NITF. It makes no attempt to support earlier versions of the DTD.

The module code is based on my Syndication::NewsML module, and much of the functionality is common to both. At present that code is copied from the NewsML module rather than shared properly through a separate module of common classes; this may be remedied in the future.

Initialization

At the moment the constructor can only take a filename as an argument, as follows:

  my $nitf = Syndication::NITF->new("file-to-parse.xml");

This attaches a parser to the file (using XML::DOM), and returns a reference to the first NITF tag. (I may decide that this is a bad idea and change it soon.)
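
Since the constructor only accepts a filename, it can be worth checking the file before handing it over. The sketch below is one way to do that. load_nitf is a hypothetical helper, not part of the module, and it assumes that a failed parse raises an exception that eval can trap.

```perl
use strict;
use warnings;

# Hypothetical helper (not part of Syndication::NITF): returns the parsed
# document object, or undef after printing a warning.
sub load_nitf {
    my ($path) = @_;

    # Refuse anything that is not a readable plain file.
    unless (-f $path && -r _) {
        warn "Cannot read '$path'\n";
        return undef;
    }

    # Load the module at run time and trap any exception from the parser
    # (assumption: a parse failure dies rather than returning undef).
    my $nitf = eval {
        require Syndication::NITF;
        Syndication::NITF->new($path);
    };
    warn "Could not parse '$path': $@" unless defined $nitf;
    return $nitf;
}

# A missing file produces a warning and undef instead of a fatal error.
my $nitf = load_nitf("no-such-file.xml");
print "nothing loaded\n" unless defined $nitf;
```

Returning undef rather than dying leaves the decision about how to react to the caller.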

Reading objects

There are five main types of calls:

  • Get an individual element:

      my $head = $nitf->gethead;

  • Return a reference to an array of elements:

      my $identifiedcontentlist = $head->getdocdata->getidentifiedcontentList;

    The array can be referenced as @$identifiedcontentlist, or an individual element can be referenced as $identifiedcontentlist->[N].

  • Return the size of a list of elements:

      my $iclcount = $head->getdocdata->getidentifiedcontentCount;

  • Get an attribute of an element (as text):

      my $href = $catalog->getHref;
  • Get the contents of an element (i.e. the text between the opening and closing tags):

      my $urlnode = $catalog->getResourceList->[0]->getUrlList->[0];
      my $urltext = $urlnode->getText;
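
The list-style calls above return ordinary Perl array references, so standard reference syntax applies. A minimal sketch follows, using a plain array reference of made-up strings as a stand-in for an element list; in real use each entry would be an element object.

```perl
use strict;
use warnings;

# Stand-in for the result of a getXXXList call (hypothetical data).
my $list = ["first", "second", "third"];

my $count = scalar @$list;   # size of the list, as a getXXXCount call would report
my $first = $list->[0];      # individual element by index
my $last  = $list->[-1];     # negative indices count from the end

# Iterate over every entry in order.
my @seen;
foreach my $item (@$list) {
    push @seen, $item;
}

print "$count entries, first is $first, last is $last\n";
```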

Not all of these calls work for all elements: for example, if an element is defined in the NITF DTD as having zero or one instances in its parent element, and you try to call getXXXList, Syndication::NITF will "croak" with an error. (The error handling will be improved in the future so that it won't croak fatally unless you want that to happen.)
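
In the meantime, a fatal croak can be trapped with Perl's block eval. The sketch below is not part of the module: try_call is a hypothetical helper, and Demo::Element is a stand-in class that only mimics the croaking behaviour, so the example runs without a NITF file.

```perl
use strict;
use warnings;

# Stand-in for a Syndication::NITF element (hypothetical, illustration only):
# the singleton accessor works, the List accessor croaks.
{
    package Demo::Element;
    use Carp;
    sub new          { return bless {}, shift }
    sub gettitle     { return "Example title" }
    sub gettitleList { croak "title is not a list element" }
}

# Hypothetical helper: call an accessor and trap a croak instead of dying.
# Returns (value, undef) on success or (undef, error message) on failure.
sub try_call {
    my ($element, $method) = @_;
    my $value;
    my $ok = eval { $value = $element->$method(); 1 };
    return (undef, $@ || "unknown error") unless $ok;
    return ($value, undef);
}

my $element = Demo::Element->new;
my ($title, $err1) = try_call($element, "gettitle");
my ($list,  $err2) = try_call($element, "gettitleList");

print "title: $title\n";
print "gettitleList failed: $err2" if defined $err2;
```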

The NITF standard contains some "business rules" that are also written into the DTD. (An example from the related NewsML standard: a NewsItem may contain nothing, a NewsComponent, one or more Update elements, or a TopicSet.) For some of these rules, the module is smart enough to detect errors and provide a warning. Again, these warnings will be improved and extended in future versions of this module.

Documentation for all the classes

Each NITF element is represented as a class. This means that you can traverse documents as Perl objects, as seen above.

Full documentation of which classes can be used in which documents is beyond me right now (with over 120 classes to document), so for now you'll have to work with the examples in the examples/ and t/ directories to see what's going on. You should be able to get a handle on it fairly quickly.

The real problem is that it's hard to know when to use getXXX() and when to use getXXXList() -- that is, when an element can have more than one entry and when it is a singleton. Quite often it isn't obvious from looking at an NITF document. For now, two ways to work this out are to try it and see if you get an error, or to have a copy of the DTD in front of you. Obviously neither of these is optimal, but documenting all 127 classes just so people can tell this difference is pretty scary as well, and so much documentation would put lots of people off using the module. So I'll probably come up with a reference document listing all the classes and methods, rather than docs for each class, in a future release. If anyone has any better ideas, please let me know.
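
The "try it and see" approach can be wrapped in a small function so that calling code always receives an array reference. This is only a sketch: get_all is a hypothetical helper, Demo::Element is a stand-in class rather than a real NITF element, and the fallback assumes that the singleton accessor returns undef when the element is absent.

```perl
use strict;
use warnings;

# Stand-in element (hypothetical): "title" is a singleton whose List accessor
# croaks, "tobjectsubject" is a repeatable element.
{
    package Demo::Element;
    use Carp;
    sub new                   { return bless {}, shift }
    sub gettitle              { return "Example title" }
    sub gettitleList          { croak "title is a singleton" }
    sub gettobjectsubjectList { return ["subject one", "subject two"] }
}

# Hypothetical helper: fetch child elements as an array reference whether
# or not the element is a singleton. $name is the element name as it
# appears in the accessor, e.g. "title".
sub get_all {
    my ($parent, $name) = @_;
    my $list_method   = "get${name}List";
    my $single_method = "get${name}";

    # Try the list accessor first; a croak here suggests a singleton.
    my $list;
    if (eval { $list = $parent->$list_method(); 1 }) {
        return defined $list ? $list : [];
    }

    # Fall back to the singleton accessor and wrap the result in a list.
    my $single = $parent->$single_method();
    return defined $single ? [$single] : [];
}

my $element  = Demo::Element->new;
my $titles   = get_all($element, "title");
my $subjects = get_all($element, "tobjectsubject");
printf "%d title(s), %d subject(s)\n", scalar @$titles, scalar @$subjects;
```

The helper hides the difference at the cost of also swallowing any other error raised by the list accessor, so checking the DTD remains the more reliable option.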

BUGS

None that I know of, but there are probably many. The test suite isn't complete, so not every method is tested, but the major ones (seem to) work fine. Of course, if you find bugs, I'd be very keen to hear about them at brendan@clueful.com.au.

SEE ALSO

XML::DOM, XML::RSS, Syndication::NewsML

AUTHOR

Brendan Quinn, Clueful Consulting Pty Ltd (brendan@clueful.com.au)

COPYRIGHT

Copyright (c) 2001, Brendan Quinn. All Rights Reserved. This module is free software. It may be used, redistributed and/or modified under the same terms as Perl itself.