XMLTV::Grab_XML - Perl extension to fetch raw XMLTV data from a site


    package Grab_XML_rur;
    use base 'XMLTV::Grab_XML';
    sub urls_by_date( $ ) { my $pkg = shift; ... }
    sub country( $ ) { my $pkg = shift; return 'Ruritania' }
    # Maybe override a couple of other methods as described below...


This module helps you write grabbers that fetch pages in XMLTV format from a website and output the data. It is not meant for grabbers that scrape human-readable sites.

It consists of several class (package) methods. To use it, subclass it and override some of them.



Called at the start of the program to set up Date::Manip. You might want to override this with a method that sets the timezone.


Returns a hash mapping YYYYMMDD dates to the URL where listings for that date can be downloaded. This method is abstract; you must override it.
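As a rough sketch of such an override, assuming a site that publishes one XMLTV file per day under a predictable URL. The host listings.example.com, the seven-day window and the use of UTC dates are all assumptions made for illustration, and the base class is left out so the snippet runs without XMLTV installed:

```perl
use strict;
use warnings;
use POSIX ();

{
    package Grab_XML_rur;
    # In a real grabber you would also say: use base 'XMLTV::Grab_XML';

    # Hypothetical site layout: one XMLTV file per day.
    my $BASE_URL = 'http://listings.example.com/xmltv';
    my $DAYS     = 7;    # assumed number of days the site publishes

    sub urls_by_date {
        my $pkg = shift;
        my %urls;
        my $now = time();
        for my $offset (0 .. $DAYS - 1) {
            # gmtime has no daylight-saving jumps, so stepping by 24 hours
            # yields consecutive (UTC) dates.
            my $date = POSIX::strftime('%Y%m%d', gmtime($now + $offset * 86400));
            $urls{$date} = "$BASE_URL/$date.xml";
        }
        return %urls;
    }
}
```

A site that uses local dates, or that lists its available days on an index page, would need a different way of building the hash.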


Given page data for a particular day, turn it into XML. The default implementation just returns the data unchanged, but you might override it if you need to decompress the data or patch it up.
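For instance, if the site served gzip-compressed files, an override might look roughly like the following. This is only a sketch: it assumes the page data arrives as the first argument after the package name, and the base class is omitted so the snippet runs on its own.

```perl
use strict;
use warnings;

{
    package Grab_XML_rur;
    # In a real grabber you would also say: use base 'XMLTV::Grab_XML';
    use IO::Uncompress::Gunzip qw(gunzip $GunzipError);

    sub xml_from_data {
        my ($pkg, $data) = @_;    # assumed calling convention
        my $xml;
        # Decompress from one in-memory buffer to another.
        gunzip(\$data => \$xml)
            or die "gunzip failed: $GunzipError\n";
        return $xml;
    }
}
```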


Bump a YYYYMMDD date by one. You probably shouldn't override this.
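To make the intended behaviour concrete, here is a standalone sketch of bumping a date using only core modules. The helper name bump_day is hypothetical and this is not the module's own implementation, which relies on Date::Manip.

```perl
use strict;
use warnings;
use POSIX qw(strftime);
use Time::Local qw(timegm);

# Hypothetical helper: return the day after a YYYYMMDD date.
sub bump_day {
    my ($day) = @_;
    my ($y, $m, $d) = $day =~ /^(\d{4})(\d{2})(\d{2})$/
        or die "expected YYYYMMDD, got '$day'\n";
    # Work at noon UTC so that adding 24 hours always lands on the next date.
    my $t = timegm(0, 0, 12, $d, $m - 1, $y);
    return strftime('%Y%m%d', gmtime($t + 86400));
}
```

Month ends, year ends and leap days are handled by the epoch arithmetic rather than by special cases.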


Return the name of the country you're grabbing for; it is used in usage messages. This method is abstract, so you must override it.


Return a command-line usage message. This calls country(), so you probably need to override only that method.
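The relationship between the two methods can be sketched with a stand-in base class. The packages Demo::Grab_Base and Demo::Grab_rur and the message wording are hypothetical; the real usage text is different.

```perl
use strict;
use warnings;

{
    # Hypothetical stand-in for the base class.
    package Demo::Grab_Base;

    sub country { die "country() is abstract\n" }

    sub usage_msg {
        my $pkg = shift;
        # Dispatch through $pkg so that a subclass's country() is used.
        my $country = $pkg->country();
        return "$0: get listings for $country in XMLTV format\n";
    }
}

{
    package Demo::Grab_rur;
    our @ISA = ('Demo::Grab_Base');

    sub country { return 'Ruritania' }
}
```

Because usage_msg() looks up country() on the package it was called on, the subclass gets a correct message without overriding usage_msg() itself.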


Given a URL, fetch the content at that URL. The default implementation calls XMLTV::Get_nice::get_nice(), but you might want to override it if you need to do wacky things with HTTP requests, like cookies.

Note that while this method fetches a page, xml_from_data() does any further processing of the result to turn it into XML.
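A sketch of an override that sends a cookie with every request, using the core HTTP::Tiny module. The cookie value is made up, the method name get and its calling convention (URL as the first argument after the package name) are assumptions, and the base class is omitted so the snippet compiles on its own. Note that replacing the default fetcher this way also bypasses whatever XMLTV::Get_nice does on your behalf.

```perl
use strict;
use warnings;
use HTTP::Tiny;

{
    package Grab_XML_rur;
    # In a real grabber you would also say: use base 'XMLTV::Grab_XML';

    # Hypothetical cookie the site is assumed to require.
    my $COOKIE = 'accept_terms=1';

    sub get {
        my ($pkg, $url) = @_;    # assumed calling convention
        my $http = HTTP::Tiny->new(default_headers => { Cookie => $COOKIE });
        my $res  = $http->get($url);
        die "could not fetch $url: $res->{status} $res->{reason}\n"
            unless $res->{success};
        # Return the raw page; any unpacking belongs in xml_from_data().
        return $res->{content};
    }
}
```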


The main program. Parse command line options, fetch and write data.

Most of the options are fairly self-explanatory but this routine also calls the XMLTV::Memoize module to look for a --cache argument. The functions memoized are those given by the cachables() method.


Returns a list of names of functions which could reasonably be memoized between runs. This will normally be whatever function fetches the web pages - you memoize that to save on repeated downloads. A subclass might want to add things to this list if it has its own way of fetching web pages.
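A subclass with its own fetching function might extend the list along these lines. The base class here is a hypothetical stand-in, as are the function names it and the subclass return.

```perl
use strict;
use warnings;

{
    # Hypothetical stand-in for the base class and the name it returns.
    package Demo::Cache_Base;

    sub cachables { return ('Demo::Cache_Base::fetch') }
}

{
    package Demo::Cache_rur;
    our @ISA = ('Demo::Cache_Base');

    # Hypothetical extra fetcher used only by this grabber.
    sub fetch_login_page { my ($url) = @_; return "content of $url" }

    sub cachables {
        my $pkg = shift;
        # Keep whatever the parent already lists and append our own function.
        return ($pkg->SUPER::cachables(), 'Demo::Cache_rur::fetch_login_page');
    }
}
```

Calling SUPER::cachables() rather than hard-coding the parent's list keeps the subclass correct if the parent's list ever changes.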


Ed Avis


perl(1), XMLTV(3).