NAME

WWW::Odeon - A simple API for screen-scraping the www.odeon.co.uk website

SYNOPSIS

 # Procedural interface

 use WWW::Odeon;

 my @regions = get_regions();
 my @cinemas = get_cinemas( $regions[2] );
 my $details = get_details( $cinemas[4] );

 my @dates = keys %$details;
 foreach my $day ( @dates ) {
   my @films = keys %{ $details->{$day} };
   foreach my $film ( @films ) {
     while ( my ( $showing, $availability ) = each %{ $details->{$day}->{$film} } ) {
       print "Film '$film' is $availability at $showing on $day\n";
     }
   }
 }

 # Object-oriented interface

 use WWW::Odeon ();

 my $odeon = new WWW::Odeon;
 $odeon->cache_time( 30 );

 my $regions = $odeon->regions;
 my $cinemas = $odeon->cinemas( $regions->[2] );
 my $details = $odeon->details( $cinemas->[4] );

 # Or directly access film data if you know the cinema name
 print "The following films are on at Odeon Leicester Square:\n";
 print join( "\n", $odeon->films('Leicester Square') );
 print "There is information about the following dates for Odeon York:\n";
 print join( "\n", $odeon->dates('York') );
 
 my @showtimes = $odeon->availability( $cinema, $film, $day );

DESCRIPTION

This module allows data about films showing at Odeon cinemas in the United Kingdom to be retrieved. The only prerequisite is LWP::Simple -- and a connection to the web!

To use this module fully it is necessary to understand the hierarchy by which Odeon UK structures its film information. The country is divided into regions; each region contains multiple cinemas; and each cinema shows several films, at various times, over several days. This structure is reflected in the module API.

Procedural Interface

get_regions

Retrieves a list of all the available regions. Unless Odeon UK radically change their systems then this is likely to remain static; the regions are currently:

Central_London, Channel_Islands, Greater_London, Midlands, North_East_England, North_West_England, Scotland, South_East_England, South_West_England, Wales

Note that regions made up of multiple words use underscores (as shown), not spaces, so for display it is recommended to apply tr/_/ / to each item in the array. The order in which regions are retrieved is not guaranteed; in particular, it is not guaranteed to be alphabetical.

If the attempt to retrieve the data fails, an empty list is returned.
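
As a minimal sketch of the display conversion described above, the following uses a hard-coded copy of the region list rather than a live call to get_regions(), so it runs without a network connection. The display_name() helper is a hypothetical name introduced here for illustration; it is not part of the module.

```perl
use strict;
use warnings;

# Region identifiers copied from the list above. A live get_regions() call
# may return them in any order.
my @regions = qw(
    Central_London Channel_Islands Greater_London Midlands
    North_East_England North_West_England Scotland
    South_East_England South_West_England Wales
);

# Hypothetical helper: turn an identifier such as "North_East_England"
# into a label suitable for display, leaving the original untouched.
sub display_name {
    my ($region) = @_;
    ( my $label = $region ) =~ tr/_/ /;
    return $label;
}

# Sort explicitly, because the retrieval order is not guaranteed.
my @labels = map { display_name($_) } sort @regions;
print "$_\n" for @labels;
```

Keep the underscored form for any later call to get_cinemas(), since that is the form the module expects.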

get_cinemas( $region )

Retrieves a list of all cinemas for a given region. The region name should be identical to that returned by the get_regions() subroutine.

If the attempt to retrieve the data fails, an empty list is returned.

get_details( $cinema )

Retrieves the film and showing details for a given cinema. The cinema name should be a string identical to one returned by the get_cinemas() subroutine. Returns a reference to a hash containing the data for that cinema.

The hashref points at a hash which has the following general structure:

 $details = { $day => { $title => { $time => $availability } } }

In other words, if $details is the hashref returned by get_details(), then:

keys %$details will be a list of dates, keys %{$details->{DATE_FROM_LIST}} will be a list of film titles, and keys %{$details->{DATE_FROM_LIST}->{FILM_TITLE}} will be a list of film times. For each time, the value will be either 'available' or 'sold out', depending on whether any tickets are still available for purchase.
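
The following sketch walks a structure of that shape to find the showings of one film that still have tickets. The sample hash is invented for illustration (the date strings and film titles are placeholders, not real output from the website), and available_showings() is a hypothetical helper, not a function supplied by the module.

```perl
use strict;
use warnings;

# Invented sample data in the shape described above.
my $details = {
    'Fri 23-07-2004' => {
        'Film A' => { '14:30' => 'available', '20:15' => 'sold out' },
        'Film B' => { '12:00' => 'available' },
    },
    'Sat 24-07-2004' => {
        'Film A' => { '17:45' => 'available' },
    },
};

# Hypothetical helper: return "day time" strings for every showing of
# $film that is still marked 'available'.
sub available_showings {
    my ( $details, $film ) = @_;
    my @found;
    for my $day ( sort keys %$details ) {
        my $times = $details->{$day}{$film} or next;    # film not on that day
        for my $time ( sort keys %$times ) {
            push @found, "$day $time" if $times->{$time} eq 'available';
        }
    }
    return @found;
}

print "$_\n" for available_showings( $details, 'Film A' );
```

With real data, pass the hashref returned by get_details() in place of the sample hash.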

Object-Oriented Interface

There is one important difference to be aware of between the procedural functions and the OO methods supplied by this module. Whereas the procedural functions get_regions() and get_cinemas() return lists, the equivalent OO methods $odeon->regions() and $odeon->cinemas() return references to arrays. Don't get caught out!
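
To make the difference concrete, this sketch uses stand-in values of the two shapes rather than real calls, so no request is made to the website. The variable names are illustrative only.

```perl
use strict;
use warnings;

# Stand-in values only; these are not fetched from the website.
my @from_function = ( 'Scotland', 'Wales' );    # shape returned by get_regions()
my $from_method   = [ 'Scotland', 'Wales' ];    # shape returned by $odeon->regions

# A list assigned to an array is indexed directly ...
my $second_from_list = $from_function[1];

# ... whereas an array reference must be dereferenced first.
my $second_from_ref = $from_method->[1];
my @copy            = @$from_method;

print scalar(@from_function), " regions via the list shape\n";
print scalar(@$from_method),  " regions via the reference shape\n";
```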

new()

Creates and returns a new WWW::Odeon object.

cache_time( $minutes )

The object-oriented API can cache the data it retrieves from the Odeon website. This has two advantages: subsequent requests for the same data are returned much faster, as there is no need to make an HTTP request; and, for the same reason, it uses less bandwidth and puts less strain on the www.odeon.co.uk website.

You can specify the length of time that cached data is valid for using this method. Data that is older than the cache time will be automatically refreshed the next time it is requested.

Because cinema programmes are fairly static, a fairly long cache time is recommended. Use a cache time of at least 60 minutes (1 hour); for most purposes an even longer one, up to 240 or 480 minutes, will be perfectly sufficient.
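
The idea of a cache time can be sketched as follows. This is not the module's actual implementation, which is not shown in this document; it is a generic time-based cache under stated assumptions. make_cached_fetcher() and its arguments are hypothetical names, and the clock is injectable purely so that the expiry logic can be exercised without waiting.

```perl
use strict;
use warnings;

# Hypothetical helper: wrap $fetch (a code ref taking a key and returning
# data) in a cache whose entries expire after $ttl_minutes, which plays the
# role of the value passed to cache_time(). $clock returns epoch seconds.
sub make_cached_fetcher {
    my ( $fetch, $ttl_minutes, $clock ) = @_;
    $clock ||= sub { time };
    my %cache;    # key => { data => ..., fetched_at => epoch seconds }

    return sub {
        my ($key) = @_;
        my $now   = $clock->();
        my $entry = $cache{$key};
        if ( $entry && $now - $entry->{fetched_at} < $ttl_minutes * 60 ) {
            return ( $entry->{data}, 1 );    # 1 = served from the cache
        }
        my $data = $fetch->($key);
        $cache{$key} = { data => $data, fetched_at => $now };
        return ( $data, 0 );                 # 0 = freshly fetched
    };
}

# Usage with a fake fetcher that counts how often it is really called.
my $calls = 0;
my $now   = 1_000_000;
my $get   = make_cached_fetcher( sub { $calls++; "data for $_[0]" }, 60, sub { $now } );

$get->('York');     # miss: the fetcher runs
$get->('York');     # hit: still within 60 minutes
$now += 61 * 60;    # advance the fake clock past the cache time
$get->('York');     # miss again: the entry has expired
print "fetcher was called $calls times\n";
```

The second value returned by the wrapper mirrors the kind of hit-or-miss flag that a method such as cached() reports.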

flush_cache()

It may occasionally be useful to flush the entire cache that the object has built up over time, such as in a long-running program that wants to ensure data is refreshed twice daily to reflect any updates made by Odeon. This method will achieve that goal.

cached()

Returns 1 if the last method called retrieved cached data, 0 otherwise. This can be useful for analysing caching performance.

regions()

Analogous to get_regions() in the procedure-oriented interface, this method returns a reference to an array of regions. Cached data will be returned when available.

cinemas( $region )

Analogous to get_cinemas() in the procedure-oriented interface, this method returns a reference to an array of cinema names for the specified region. Cached data will be returned when available.

details( $cinema )

Analogous to get_details() in the procedure-oriented interface, this method returns a reference to a hash containing the details for the specified cinema. Cached data will be returned when available.

See the get_details() function from the procedure-oriented interface for a description of the format of the returned data structure.

films( $cinema )

This will return a list of film titles that are showing at the specified cinema, using cached data when available.

Note that it is not necessary to load the list of regions and/or cinemas beforehand, as long as you know the exact cinema title (as used by www.odeon.co.uk). So the following example is a valid Perl one-liner:

 perl -MWWW::Odeon -le '$o=new WWW::Odeon;print join "\n", $o->films("Leicester Square")'

dates( $cinema )

This will return a list of dates for which there is film information available for the specified cinema. This method will use cached data where available. Note that the dates are in the format used on the Odeon website, which is day DD-MM-YYYY.

It is not necessary to load in the list of regions and/or cinemas beforehand, as long as you know the exact cinema title (as used by www.odeon.co.uk).
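
One way to think about films() and dates() is as views onto the structure returned by details(). The sketch below derives both lists from an invented sample hash; whether the module computes them this way internally is an assumption, and dates_from_details() and films_from_details() are hypothetical names. The date strings are placeholders rather than real output from the website.

```perl
use strict;
use warnings;

# Invented sample data in the documented shape.
my $details = {
    'Fri 23-07-2004' => {
        'Film A' => { '14:30' => 'available' },
        'Film B' => { '12:00' => 'sold out' },
    },
    'Sat 24-07-2004' => {
        'Film A' => { '17:45' => 'available' },
        'Film C' => { '21:00' => 'available' },
    },
};

# Dates are the top-level keys. The sort is a plain string sort,
# which is not necessarily chronological.
sub dates_from_details {
    my ($details) = @_;
    return sort keys %$details;
}

# Film titles are the second-level keys, de-duplicated across all dates.
sub films_from_details {
    my ($details) = @_;
    my %seen;
    $seen{$_} = 1 for map { keys %$_ } values %$details;
    return sort keys %seen;
}

print "Dates: ", join( ', ', dates_from_details($details) ), "\n";
print "Films: ", join( ', ', films_from_details($details) ), "\n";
```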

availability( $cinema, $film, $day )

Returns a sorted list of times (in format HH:MM) when the specified film is showing at the specified cinema and date. Cached data will be used when available.

The caveats here are the same as with the other methods in this class: data MUST be supplied in the format that is recognised by www.odeon.co.uk, so the cinema name must match one returned by cinemas(), and the film and requested day need to be in the formats mentioned above.

This method can be called without any need to pre-load the list of regions and/or cinemas.

BACKGROUND

This module provides simple procedure- and object-oriented APIs for accessing film times from the Odeon UK website at http://www.odeon.co.uk/. It was inspired by Matthew Somerville's Accessible Odeon website at http://www.dracos.co.uk/odeon/ which was closed down in July 2004 after receiving "Cease and Desist"-type notifications from Odeon's lawyers.

As the official Odeon site is extremely poorly written -- it requires Microsoft Internet Explorer 5.x or 6.x to work, and then only if JavaScript is enabled -- it was felt that there was a strong community requirement for a continuation of an accessible form of the website. This module allows Odeon's data to be retrieved so that it can be displayed by a CGI or command-line program, in the hope that if an open source solution is available to the community then Odeon will not be able to keep on shutting down each and every "Accessible" front-end to their site or, ideally, will create a working version of their own site.

COPYRIGHT

Copyright 2004 Iain Tatch <iaint@cpan.org>

This software is distributed under the Artistic Licence, a copy of which should accompany the distribution, or if not, can be found on the Web at http://www.opensource.org/licenses/artistic-license.php