NAME

XML::FeedPP -- Parse/write/merge/edit web feeds, RSS/RDF/Atom

SYNOPSIS

Get an RSS file and parse it.

    my $source = 'http://use.perl.org/index.rss';
    my $feed = XML::FeedPP->new( $source );
    print "Title: ", $feed->title(), "\n";
    print "Date: ", $feed->pubDate(), "\n";
    foreach my $item ( $feed->get_item() ) {
        print "URL: ", $item->link(), "\n";
        print "Title: ", $item->title(), "\n";
    }

Generate an RDF file and save it.

    my $feed = XML::FeedPP::RDF->new();
    $feed->title( "use Perl" );
    $feed->link( "http://use.perl.org/" );
    $feed->pubDate( "Thu, 23 Feb 2006 14:43:43 +0900" );
    my $item = $feed->add_item( "http://search.cpan.org/~kawasaki/XML-TreePP-0.02" );
    $item->title( "Pure Perl implementation for parsing/writing xml file" );
    $item->pubDate( "2006-02-23T14:43:43+09:00" );
    $feed->to_file( "index.rdf" );

Merge some RSS/RDF files and convert them into Atom format.

    my $feed = XML::FeedPP::Atom->new();                # create empty atom file
    $feed->merge( "rss.xml" );                          # load local RSS file
    $feed->merge( "http://www.kawa.net/index.rdf" );    # load remote RDF file
    my $now = time();
    $feed->pubDate( $now );                             # touch date
    my $atom = $feed->to_string();                      # get Atom source code

DESCRIPTION

The XML::FeedPP module parses an RSS/RDF/Atom file, converts its format, merges other files into it, and generates an XML file. This module is a pure Perl implementation and does not require any other modules except for XML::TreePP.

METHODS FOR FEED

$feed = XML::FeedPP->new( 'index.rss' );

This constructor method creates an instance of XML::FeedPP. The source must be in one of the supported feed formats: RSS, RDF or Atom. In this form, the first argument is a file name on the local file system.

$feed = XML::FeedPP->new( 'http://use.perl.org/index.rss' );

The first argument may also be a URL on a remote web server. The LWP::UserAgent module is required to download it.

$feed = XML::FeedPP->new( '<?xml?><rss version="2.0"><channel>....' );

The first argument may also be the XML source code itself.
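
As a minimal sketch of the string form, the following passes a small hand-written RSS 2.0 document to the constructor and reads it back. The feed content and the example.com URLs are placeholders, not part of this module.

```perl
use strict;
use warnings;
use XML::FeedPP;

# A tiny hand-written RSS 2.0 document (placeholder content).
my $xml = <<'XML';
<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0">
  <channel>
    <title>Example feed</title>
    <link>http://www.example.com/</link>
    <item>
      <title>First post</title>
      <link>http://www.example.com/1.html</link>
    </item>
  </channel>
</rss>
XML

# The generic constructor detects the format from the source.
my $feed = XML::FeedPP->new($xml);
print "Class: ", ref($feed), "\n";
print "Title: ", $feed->title(), "\n";
print "Items: ", scalar( $feed->get_item() ), "\n";
```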

$feed = XML::FeedPP::RSS->new( $source );

This constructor method creates an instance for the RSS format. The first argument is optional. This method returns an empty instance when $source is not defined.

$feed = XML::FeedPP::RDF->new( $source );

This constructor method creates an instance for the RDF format. The first argument is optional. This method returns an empty instance when $source is not defined.

$feed = XML::FeedPP::Atom->new( $source );

This constructor method creates an instance for the Atom format. The first argument is optional. This method returns an empty instance when $source is not defined.

$feed->load( $source );

This method loads an RSS/RDF/Atom file, as the new() method does.

$feed->merge( $source );

This method merges an RSS/RDF/Atom file into the existing $feed instance.
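
A short sketch of a merge across formats, assuming that merge() accepts XML source code in the same way new() does. The RSS feed here is built in memory only so that the example does not need a file or a network connection; the URLs are placeholders.

```perl
use strict;
use warnings;
use XML::FeedPP;

# Build a small RSS feed to act as the merge source.
my $rss = XML::FeedPP::RSS->new();
$rss->title('Source feed');
$rss->link('http://www.example.com/');
$rss->add_item('http://www.example.com/a.html')->title('Article A');

# Merge its XML source into an empty Atom feed.
my $atom = XML::FeedPP::Atom->new();
$atom->merge( $rss->to_string() );
print "Merged items: ", scalar( $atom->get_item() ), "\n";
```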

$string = $feed->to_string( $encoding );

This method generates the XML source as a string and returns it. The output $encoding is optional and the default value is 'UTF-8'. On Perl 5.8 and later, any encoding supported by the Encode module is available. On Perl 5.005 and 5.6.1, only the four encodings supported by the Jcode module are available: 'UTF-8', 'Shift_JIS', 'EUC-JP' and 'ISO-2022-JP'. Normally, 'UTF-8' is recommended for compatibility.

$feed->to_file( $filename, $encoding );

This method generates an XML file. The output $encoding is optional and the default value is 'UTF-8'.
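
The two output methods can be sketched together as follows. The temporary file (created with the core File::Temp module) and the feed content are assumptions made for illustration; both calls rely on the default 'UTF-8' encoding.

```perl
use strict;
use warnings;
use File::Temp qw(tempfile);
use XML::FeedPP;

my $feed = XML::FeedPP::RSS->new();
$feed->title('Example feed');
$feed->link('http://www.example.com/');
$feed->add_item('http://www.example.com/1.html')->title('First post');

# Serialize to a string, using the default encoding.
my $xml = $feed->to_string();
print $xml;

# Write the same feed to a temporary file and load it back.
my ( $fh, $filename ) = tempfile( SUFFIX => '.rss', UNLINK => 1 );
close $fh;
$feed->to_file($filename);
my $reloaded = XML::FeedPP->new($filename);
print "Reloaded title: ", $reloaded->title(), "\n";
```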

$item = $feed->get_item( $num );

This method returns item(s) in $feed. If $num is defined, it returns the $num-th item's object. If $num is not defined, it returns an array of all items in list context, or the number of items in scalar context.
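
The three calling styles can be sketched as below. This assumes that $num is a zero-based array index, which this page does not state explicitly; the URLs are placeholders.

```perl
use strict;
use warnings;
use XML::FeedPP;

my $feed = XML::FeedPP::RSS->new();
$feed->add_item("http://www.example.com/$_.html") for 1 .. 3;

my @items = $feed->get_item();     # list context: all item objects
my $count = $feed->get_item();     # scalar context: number of items
my $first = $feed->get_item(0);    # one item (assumed zero-based index)

print "Items in list: ", scalar(@items), "\n";
print "Item count:    $count\n";
print "First link:    ", $first->link(), "\n";
```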

$item = $feed->add_item( $url );

This method creates a new item/entry and returns its instance. The first argument $url is the URL of the new item/entry. RSS's <item> element is an instance of the XML::FeedPP::RSS::Item class. RDF's <item> element is an instance of the XML::FeedPP::RDF::Item class. Atom's <entry> element is an instance of the XML::FeedPP::Atom::Entry class.

$item = $feed->add_item( $srcitem );

This method duplicates an item/entry and adds it to $feed. $srcitem is an instance of one of the XML::FeedPP::*::Item classes, as returned by the get_item() method above.

$feed->remove_item( $num );

This method removes an item/entry from $feed.

$feed->clear_item();

This method removes all items/entries from $feed.

$feed->sort_item();

This method sorts the order of items in $feed by pubDate.

$feed->uniq_item();

This method makes items unique. The second and subsequent items that have the same link URL are removed.

$feed->limit_item( $num );

This method removes the items that exceed the specified limit.

$feed->normalize();

This method calls both the sort_item() and uniq_item() methods.
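
A minimal sketch of the clean-up methods working together: three items are added, two of which share a link, then the feed is normalized and truncated. The URLs and dates are placeholders, and the example makes no claim about the resulting sort direction.

```perl
use strict;
use warnings;
use XML::FeedPP;

my $feed = XML::FeedPP::RSS->new();

my $old = $feed->add_item('http://www.example.com/old.html');
$old->pubDate('2006-02-01T00:00:00Z');

my $new = $feed->add_item('http://www.example.com/new.html');
$new->pubDate('2006-02-23T14:43:43+09:00');

# Same link as the first item, so uniq_item() should drop one of them.
my $dup = $feed->add_item('http://www.example.com/old.html');
$dup->pubDate('2006-02-01T00:00:00Z');

$feed->normalize();                # sort_item() plus uniq_item()
my $after_normalize = $feed->get_item();
print "After normalize: $after_normalize\n";

$feed->limit_item(1);              # keep at most one item
my $after_limit = $feed->get_item();
print "After limit:     $after_limit\n";
```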

$feed->xmlns( 'xmlns:media' => 'http://search.yahoo.com/mrss' );

This code adds an XML namespace at the document root of the feed.

$url = $feed->xmlns( 'xmlns:media' );

This code returns the URL of the specified XML namespace.

@list = $feed->xmlns();

This code returns the list of all XML namespaces used in $feed.
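
The three xmlns() forms can be tried on an empty feed, as in this short sketch. It uses the same namespace as the examples above; what the list form contains beyond the namespace just added is not assumed.

```perl
use strict;
use warnings;
use XML::FeedPP;

my $feed = XML::FeedPP::RSS->new();

# Declare the namespace, then read it back.
$feed->xmlns( 'xmlns:media' => 'http://search.yahoo.com/mrss' );
my $url  = $feed->xmlns('xmlns:media');
my @list = $feed->xmlns();

print "URL:  $url\n";
print "List: @list\n";
```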

METHODS FOR CHANNEL

$feed->title( $text );

This method sets/gets the feed's <title> value. This method returns the current value when $text is not defined.

$feed->description( $html );

This method sets/gets the feed's <description> value in HTML. This method returns the current value when the $html is not defined.

$feed->pubDate( $date );

This method sets/gets the feed's <pubDate> value for RSS, <dc:date> value for RDF, or <modified> value for Atom. This method returns the current value when the $date is not defined. See also the DATE/TIME FORMATS section.

$feed->copyright( $text );

This method sets/gets the feed's <copyright> value for RSS/Atom, or <dc:rights> element for RDF. This method returns the current value when the $text is not defined.

$feed->link( $url );

This method sets/gets the URL of the web site as the feed's <link> value for RSS/RDF/Atom. This method returns the current value when the $url is not defined.

$feed->language( $lang );

This method sets/gets the feed's <language> value for RSS, <dc:language> element for RDF, or <feed xml:lang=""> attribute for Atom. This method returns the current value when the $lang is not defined.

$feed->image( $url, $title, $link, $description, $width, $height )

This method sets/gets the feed's <image> value and its child nodes for RSS/RDF. This method is ignored for Atom. This method returns the current values as an array when no arguments are given.
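
As a sketch of the set/get pattern shared by the channel methods, the following fills in an empty RSS feed and reads the values back. All values are placeholders.

```perl
use strict;
use warnings;
use XML::FeedPP;

my $feed = XML::FeedPP::RSS->new();

# With an argument, each method sets the value.
$feed->title('Example feed');
$feed->link('http://www.example.com/');
$feed->description('News about <b>examples</b>');
$feed->language('en');
$feed->copyright('Copyright 2006 Example Inc.');
$feed->pubDate(1140705823);

# Without an argument, each method returns the current value.
print "Title:    ", $feed->title(),    "\n";
print "Link:     ", $feed->link(),     "\n";
print "Language: ", $feed->language(), "\n";
print "Date:     ", $feed->pubDate(),  "\n";
```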

METHODS FOR ITEM

$item->title( $text );

This method sets/gets the item's <title> value. This method returns the current value when the $text is not defined.

$item->description( $html );

This method sets/gets the item's <description> value in HTML. This method returns the current value when $html is not defined.

$item->pubDate( $date );

This method sets/gets the item's <pubDate> value for RSS, <dc:date> element for RDF, or <issued> element for Atom. This method returns the current value when $date is not defined. See also the DATE/TIME FORMATS section.

$item->category( $text );

This method sets/gets the item's <category> value for RSS/RDF. This method is ignored for Atom. This method returns the current value when the $text is not defined.

$item->author( $text );

This method sets/gets the item's <author> value for RSS, <creator> value for RDF, or <author><name> value for Atom. This method returns the current value when the $text is not defined.

$item->guid( $guid );

This method sets/gets the item's <guid> value for RSS or <id> value for Atom. This method is ignored for RDF. The second argument is optional. This method returns the current value when $guid is not defined.

$item->set( $key => $value, ... );

This method sets some node values or attributes. See also the next section: GENERAL SET/GET

$value = $item->get( $key );

This method returns the node value or attribute. See also the next section: GENERAL SET/GET

$link = $item->link();

This method returns the item's <link> value.
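
The item methods follow the same set/get pattern as the channel methods. A minimal sketch with placeholder values:

```perl
use strict;
use warnings;
use XML::FeedPP;

my $feed = XML::FeedPP::RSS->new();
my $item = $feed->add_item('http://www.example.com/1.html');

# Set the item's fields.
$item->title('First post');
$item->description('<p>Hello, world.</p>');
$item->author('alice@example.com (Alice)');
$item->category('news');

# Read them back.
print "Link:     ", $item->link(),     "\n";
print "Title:    ", $item->title(),    "\n";
print "Author:   ", $item->author(),   "\n";
print "Category: ", $item->category(), "\n";
```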

GENERAL SET/GET

XML::FeedPP understands only the <rdf:*> and <dc:*> modules and the default namespaces of RSS/RDF/Atom. There are NO native methods for any other external modules, such as <media:*>. However, the set()/get() methods can be used to set or get the value of any element or attribute in these modules.

$item->set( 'module:name' => $value );

This code sets the value of the child node: <item><module:name>$value

$item->set( 'module:name@attr' => $value );

This code sets the value of the child node's attribute: <item><module:name attr="$value">

$item->set( '@attr' => $value );

This code sets the value of the item's attribute: <item attr="$value">

$item->set( 'hoge/pomu@hare' => $value );

This code sets the value of the child node's child node's attribute: <item><hoge><pomu hare="$value">
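
A short sketch of set()/get() with an external module, using the media namespace from the xmlns() examples. The element names media:title and media:thumbnail and their values are chosen for illustration only.

```perl
use strict;
use warnings;
use XML::FeedPP;

my $feed = XML::FeedPP::RSS->new();
$feed->xmlns( 'xmlns:media' => 'http://search.yahoo.com/mrss' );

my $item = $feed->add_item('http://www.example.com/1.html');

# Set a child node's value and another child node's attribute.
$item->set( 'media:title'         => 'Clip title' );
$item->set( 'media:thumbnail@url' => 'http://www.example.com/thumb.png' );

# Read both back with the same keys.
my $title = $item->get('media:title');
my $thumb = $item->get('media:thumbnail@url');
print "Title:     $title\n";
print "Thumbnail: $thumb\n";
```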

DATE/TIME FORMATS

XML::FeedPP allows you to describe a date/time in any of the following three formats:

$date = "Thu, 23 Feb 2006 14:43:43 +0900";

The first format is the format preferred for the HTTP protocol. This is the native format of RSS 2.0 and one of the formats defined by RFC 1123.

$date = "2006-02-23T14:43:43+09:00";

The second format is the W3CDTF format. This is the native format of RDF and one of the formats defined by ISO 8601.

$date = 1140705823;

The last format is the number of seconds since the epoch, 1970-01-01T00:00:00Z. This is the value returned by Perl's time() function.
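
The following sketch passes each of the three formats to pubDate(). It assumes that the getter returns the date in the native format of the feed type (the HTTP-style format for RSS, W3CDTF for RDF), so the printed strings may differ from the strings that were set. The URLs are placeholders.

```perl
use strict;
use warnings;
use XML::FeedPP;

# W3CDTF in, read back from an RSS item.
my $rss      = XML::FeedPP::RSS->new();
my $rss_item = $rss->add_item('http://www.example.com/1.html');
$rss_item->pubDate('2006-02-23T14:43:43+09:00');
print "RSS item: ", $rss_item->pubDate(), "\n";

# HTTP-style date in, read back from an RDF item.
my $rdf      = XML::FeedPP::RDF->new();
my $rdf_item = $rdf->add_item('http://www.example.com/1.html');
$rdf_item->pubDate('Thu, 23 Feb 2006 14:43:43 +0900');
print "RDF item: ", $rdf_item->pubDate(), "\n";

# Epoch seconds in, read back from the RSS channel.
$rss->pubDate(1140705823);
print "RSS feed: ", $rss->pubDate(), "\n";
```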

MODULE DEPENDENCIES

The XML::FeedPP module requires only the XML::TreePP module, which is a pure Perl implementation as well. The LWP::UserAgent module is also required to download a file from a remote web server. The Jcode module is required to convert Japanese encodings on Perl 5.005 and 5.6.1. The Jcode module is NOT required on Perl 5.8.x and later.

AUTHOR

Yusuke Kawasaki, http://www.kawa.net/

COPYRIGHT AND LICENSE

Copyright (c) 2006 Yusuke Kawasaki. All rights reserved. This program is free software; you can redistribute it and/or modify it under the same terms as Perl itself.