Peter H. Li

NAME

WebService::Nextbus - A screen scraper for populating the data structure of WebService::Nextbus::Agency.

SYNOPSIS

  use WebService::Nextbus;
  my $nb = WebService::Nextbus->new;
  $nb->buildAgency('sf-muni'); # Scraping the webpages repeatedly can take time
  my @stops = $nb->agencies->{'sf-muni'}->str2stopCodes('N', 'judah', 'Chu Dub');

The codes in @stops can now be used as valid GET arguments on the Nextbus website.

DESCRIPTION

WebService::Nextbus can determine the relevant GET arguments for queries to the Nextbus website (www.nextbus.com) by screen scraping. WebService::Nextbus::Agency implements a basic data structure for storing and retrieving the information gleaned by this screen scraping.

Once the proper GET argument has been retrieved, a web user agent can use it to build a URL for the desired information. This user agent functionality will probably be incorporated into WebService::Nextbus eventually.

The screen scraping does not require any additional HTML parser module. I did this to improve interoperability, but the parsing is therefore necessarily crude and perhaps not as fast as it could be (it uses regular expressions rather than a state machine). This shouldn't be a major issue, however. Although the initial screen scraping (with buildAgency, for example) can be slow, you should be able to store the results (using Storable, for example) and then retrieve them quickly. This should work well because the data don't change very often.

For example:

  # As above, but using the emery agency because it is smaller and faster to scrape
  use WebService::Nextbus;
  my $nb = WebService::Nextbus->new;
  $nb->buildAgency('emery'); # Scraping the webpages repeatedly can take time

  # Now store the resulting agency, retrieve it, and dump its contents
  use Storable qw(nstore retrieve); # retrieve must be imported as well
  nstore($nb->agencies->{'emery'}, 'emery.store');
  my $agency = retrieve('emery.store');
  print $agency->routesAsString;

  # Or store just the routes tree, retrieve it, and dump its contents
  nstore($nb->agencies->{'emery'}->routes, 'emery_routes.store');
  $agency = WebService::Nextbus::Agency->new;
  $agency->routes(retrieve('emery_routes.store'));
  print $agency->routesAsString;
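
The store-and-retrieve pattern above can be wrapped in a small caching helper, so that the slow scrape runs only when no stored copy exists yet. The following is a minimal sketch, not part of WebService::Nextbus: the helper name cached_or_build and its arguments are assumptions made for illustration, and it relies only on the Storable functions nstore and retrieve.

```perl
use strict;
use warnings;
use Storable qw(nstore retrieve);

# Hypothetical helper (not provided by WebService::Nextbus).
# Returns the data stored in $file if that file exists; otherwise calls
# $builder, stores its result in $file, and returns that result.
sub cached_or_build {
    my ($file, $builder) = @_;
    return retrieve($file) if -e $file;   # fast path: reuse the stored copy
    my $data = $builder->();              # slow path: e.g. screen scraping
    nstore($data, $file);                 # $data must be a reference
    return $data;
}

# Intended use with the module (commented out because it needs network access):
# my $agency = cached_or_build('emery.store', sub {
#     my $nb = WebService::Nextbus->new;
#     $nb->buildAgency('emery');
#     return $nb->agencies->{'emery'};
# });
```

Because the data change only occasionally, deleting the store file is probably enough to force a fresh scrape; a real version might also check the file's age.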

EXPORT

None by default; OO interface.

REQUIRES

Requires the LWP::UserAgent module and the WebService::Nextbus::Agency package. Tests require the Test::More module.

AUTHOR

Peter H. Li <phli@cpan.org>

COPYRIGHT

Licensed under the Creative Commons Attribution-NonCommercial-ShareAlike 2.0 license: http://creativecommons.org/licenses/by-nc-sa/2.0/

SEE ALSO

WebService::Nextbus::Agency, LWP::UserAgent, perl.