NAME

Geo::Coder::Many - Module to tie together multiple Geo::Coder::* modules

DESCRIPTION

Geo::Coder::Many is a wrapper for multiple Geo::Coder::* modules, based on Geo::Coder::Multiple.

Amongst other things, Geo::Coder::Many adds:

Geocoder precision information
Alternative scheduling methods (weighted random, and ordered list)
Timeouts for geocoders which are failing
Optional callbacks for result filtering and picking

SYNOPSIS

General steps for using Geo::Coder::Many:

1 Create Geo::Coder::* objects for the geocoders you want to use
2 Create the Geo::Coder::Many object
3 Call add_geocoder for each of the geocoders you want to use
4 Set any filter or picker callbacks you require (optional)
5 Use the geocode method to do all of your geocoding

EXAMPLE

  # for Geo::Coder::Jingle and Geo::Coder::Bells
  use Geo::Coder::Jingle;
  use Geo::Coder::Bells;
  use Geo::Coder::Many;
  
  my $options = {
    cache   => $cache_object,
  };

  my $geocoder_multi = Geo::Coder::Many->new( $options );

  my $jingle = Geo::Coder::Jingle->new( apikey => 'Jingle API Key' );

  my $jingle_options = {
    geocoder    => $jingle,
    daily_limit => 25000,
  };

  $geocoder_multi->add_geocoder( $jingle_options );

  my $bells = Geo::Coder::Bells->new( apikey => 'Bells API Key' );

  my $bells_options = {
    geocoder    => $bells,
    daily_limit => 4000,
  };

  $geocoder_multi->add_geocoder( $bells_options );

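  # Pass the query as a hash reference with a 'location' key
  my $location = $geocoder_multi->geocode( { location => '82 Clerkenwell Road, London, EC1M 5RF' } );

  # 200 indicates a fresh successful lookup (see the response codes under geocode)
  if ( $location->{response_code} == 200 ) {
    print $location->{address} . "\n";
  }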

METHODS

new

Constructs a new Geo::Coder::Many object and returns it. If no options are specified, no caching will be done for the geocoding results.

The normalize_code_ref option is a code reference used to normalize location strings, so that equivalent queries map to the same cache key and are looked up correctly.
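
As an illustration, a normalizer might trim, collapse whitespace and lowercase the query. This is only a sketch: the subroutine name normalize_location is hypothetical, and it assumes the callback receives the location string as its only argument and returns the normalized string.

```perl
use strict;
use warnings;

# Hypothetical normalizer: trim, collapse internal whitespace, lowercase.
# Assumption: the callback is called with the location string as its only
# argument and must return the normalized string.
sub normalize_location {
    my ($location) = @_;
    $location =~ s/^\s+|\s+$//g;    # strip leading and trailing whitespace
    $location =~ s/\s+/ /g;         # collapse runs of whitespace to one space
    return lc $location;
}

# Options hash as it could be passed to Geo::Coder::Many->new
my $options = {
    normalize_code_ref => \&normalize_location,
};

print normalize_location("  82  Clerkenwell Road,  LONDON "), "\n";
# prints "82 clerkenwell road, london"
```

With a normalizer like this, queries that differ only in case or spacing would share one cache entry.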

The scheduler_type specifies how load balancing should be done. Options currently available are:

WRR (Weighted round-robin)
    Round-robin scheduling, weighted by the daily_limit values for the geocoders
    The same behaviour as Geo::Coder::Multiple
OrderedList
    A strict preferential ordering by daily_limit - the geocoder with the
    highest limit will always be used. If that fails, the next highest will be
    used, and so on.
WeightedRandom
    Geocoders will be picked at random, each with probability proportional to
    its specified daily_limit.

Note: other scheduling options can be implemented by sub-classing Geo::Coder::Many::Scheduler or Geo::Coder::Many::UniquenessScheduler.
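
To make the WeightedRandom idea concrete, here is a standalone sketch of picking a name with probability proportional to its daily_limit, using a simple cumulative-weight draw. It is not taken from the module's scheduler code; pick_weighted and the %daily_limit table are hypothetical names used for illustration only.

```perl
use strict;
use warnings;
use List::Util qw(sum);

# Hypothetical table of geocoder names and their daily limits.
my %daily_limit = ( jingle => 25000, bells => 4000 );

# Pick one name with probability proportional to its weight.
sub pick_weighted {
    my ($weights) = @_;
    my $total = sum( 0, values %$weights );
    return unless $total > 0;          # nothing to pick from

    my $roll = rand($total);           # uniform in [0, $total)
    for my $name ( sort keys %$weights ) {
        return $name if $roll < $weights->{$name};
        $roll -= $weights->{$name};    # move on to the next slice
    }
    return;                            # fallback for rounding edge cases
}

# Rough check of the distribution over many draws.
my %count;
$count{ pick_weighted( \%daily_limit ) }++ for 1 .. 10_000;
printf "%-6s %d\n", $_, $count{$_} // 0 for sort keys %daily_limit;
```

With the limits above, 'jingle' should be chosen roughly 86% of the time (25000 out of 29000).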

If use_timeouts is true, geocoders that are unsuccessful will not be queried again for a set amount of time. The timeout period increases exponentially with each consecutive failure.
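
The exponential growth of the timeout can be pictured with a small sketch. The base period and growth factor below are assumptions chosen for illustration; the values the module actually uses are not stated here, and timeout_for is a hypothetical helper.

```perl
use strict;
use warnings;

# Assumed parameters, for illustration only.
my $BASE_TIMEOUT = 1;    # seconds to wait after the first failure
my $FACTOR       = 2;    # multiplier for each further consecutive failure

# Timeout in seconds after a given number of consecutive failures.
sub timeout_for {
    my ($consecutive_failures) = @_;
    return 0 if $consecutive_failures < 1;    # no failures, no timeout
    return $BASE_TIMEOUT * $FACTOR**( $consecutive_failures - 1 );
}

printf "%d failure(s): wait %d s\n", $_, timeout_for($_) for 1 .. 5;
```

Under these assumed values the waits are 1, 2, 4, 8 and 16 seconds; a success would reset the failure count to zero.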

  KEY                   VALUE
  -----------           --------------------
  cache                 Cache object reference  (optional)
  normalize_code_ref    A normalization code ref (optional)
  scheduler_type        Name of the scheduler type to use (default: WRR)
  use_timeouts          Whether to time out failing geocoders (default: false)

add_geocoder

This method adds a geocoder to the list of possibilities.

Before any geocoding can be performed, at least one geocoder must be added to the list of available geocoders.

If the same geocoder is added twice, only the instance added first will be used. All other additions will be ignored.

  KEY                   VALUE
  -----------           --------------------
  geocoder              geocoder reference object
  daily_limit           geocoder source limit per 24 hour period

Example

  my $jingle = Geo::Coder::Jingle->new( apikey => 'Jingle API Key' );
  my $jingle_limit = 25000;

  my $options = {
    geocoder    => $jingle,
    daily_limit => $jingle_limit,
  };

  $geocoder_multi->add_geocoder( $options );

set_filter_callback

Sets the callback used for filtering results. By default, all results are passed through. If a callback is set, only results for which the callback returns true are passed through. The callback takes one argument: a Response object to be judged for fitness. It should return true or false, depending on whether that Response is deemed suitable for consideration by the picker.
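
A filter callback could look like the sketch below. The Response interface is not described in this document, so the example uses a stand-in class, My::FakeResponse, and assumes a get_precision accessor; check the Response class documentation for the real method names before relying on this.

```perl
use strict;
use warnings;

# Stand-in for the Response object, for illustration only.
# Assumption: the real Response offers a comparable precision accessor.
{
    package My::FakeResponse;
    sub new           { my ( $class, %args ) = @_; return bless {%args}, $class }
    sub get_precision { return $_[0]{precision} }
}

# Filter: accept only responses whose precision is known and at least 0.5.
my $precise_enough = sub {
    my ($response) = @_;
    my $precision = $response->get_precision;
    return ( defined($precision) && $precision >= 0.5 ) ? 1 : 0;
};

# It would then be registered with:
#   $geocoder_multi->set_filter_callback( $precise_enough );

for my $p ( 0.9, 0.2, undef ) {
    my $verdict = $precise_enough->( My::FakeResponse->new( precision => $p ) );
    printf "%s => %s\n", $p // 'undef', $verdict ? 'keep' : 'drop';
}
```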

set_picker_callback

Sets the callback used for result picking. This determines which single result will actually be returned by the geocode method. By default, the first valid result (that has passed the filter callback, if one was set) is returned.

As an alternative to passing a subroutine reference, you can pass a scalar with a name that refers to one of the built-in callbacks. An empty string or 'first' sets the behaviour back to the default: accept the first result that is offered. 'max_precision' fetches all results and chooses the one with the greatest precision value.

The picker callback has two arguments: a reference to an array of the valid results that have been collected so far, and a value that is true if there are more results available and false otherwise. The callback should return a single result from the list, if one is acceptable. If none are acceptable, the callback may return undef, indicating that more results to pick from are desired. If these are available, the picker will be called again once they have been added to the results array.
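
The protocol described above can be sketched as a picker that accepts the first sufficiently precise result, asks for more while more are available, and otherwise settles for the best it has. The threshold is hypothetical, and the sketch assumes each result is a hash reference with a precision key like the result hash documented under geocode.

```perl
use strict;
use warnings;

my $MIN_PRECISION = 0.8;    # hypothetical threshold

# Assumption: each result is a hash reference with a 'precision' key.
my $picker = sub {
    my ( $results, $more_available ) = @_;

    # Take any result that is already precise enough.
    for my $result (@$results) {
        return $result if ( $result->{precision} // 0 ) >= $MIN_PRECISION;
    }

    # Nothing good enough yet: request more results if there are any.
    return undef if $more_available;

    # Otherwise settle for the most precise result collected so far.
    my ($best) =
      sort { ( $b->{precision} // 0 ) <=> ( $a->{precision} // 0 ) } @$results;
    return $best;
};

# It would then be registered with:
#   $geocoder_multi->set_picker_callback( $picker );
```

A picker like this trades latency for quality: it only causes further geocoders to be queried when the results so far fall below the threshold.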

Note that since geocoders are not (currently) queried in parallel, a picker that requires lots of results to make a decision may take longer to return a value.

geocode

  my $options = {
    location        => $location,
    results_cache   => $cache,
  };

  my $found_location = $geocoder_multi->geocode( $options );

The arguments to the geocode method are:

  KEY                   VALUE
  -----------           --------------------
  location              location string to pass to geocoder
  results_cache         reference to a cache object, will override the default
  no_cache              if set, the result will not be retrieved or set in cache (off by default)
  wait_for_retries      if set, the method will wait until it's sure all geocoders have been tried (off by default)

This method is the basis for the class. It first looks for the result in the cache and returns it immediately on a cache hit.

On a cache miss, the geocode method is called, with the location as the argument, on the next available geocoder object in the sequence.

If called in list context, all the matching results will be returned; otherwise only the first result will be returned.

A matching address will have the following keys in the hash reference.

  KEY                   VALUE
  -----------           --------------------
  response_code         integer response code (see below)
  address               matched address
  latitude              latitude of matched address
  longitude             longitude of matched address
  country               country of matched address (not available for all
                        geocoders)
  geocoder              source used to lookup address
  location              the original query string
  precision             scalar from 0.0 to 1.0 denoting granularity of the
                        result (undef if not known)

The geocoder key will contain a string denoting which geocoder returned the results (e.g. 'jingle').

The response_code key will contain the response code. The possible values are:

  200   Success 
  210   Success (from cache)
  401   Unable to find location
  402   All geocoder limits reached (not yet implemented)
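
A caller may want to treat both success codes alike and report the others. The helpers below are a sketch built only from the table above; describe_result and is_success are hypothetical names, not part of the module.

```perl
use strict;
use warnings;

# Response codes and descriptions, copied from the table above.
my %RESPONSE_TEXT = (
    200 => 'Success',
    210 => 'Success (from cache)',
    401 => 'Unable to find location',
    402 => 'All geocoder limits reached (not yet implemented)',
);

# Human-readable description of a result hash.
sub describe_result {
    my ($result) = @_;
    return 'No result'
      unless defined($result) && defined( $result->{response_code} );
    my $code = $result->{response_code};
    my $text = $RESPONSE_TEXT{$code} // 'Unknown response code';
    return "$code $text";
}

# True for a fresh lookup (200) or a cache hit (210).
sub is_success {
    my ($result) = @_;
    my $code = $result->{response_code} // 0;
    return $code == 200 || $code == 210;
}

print describe_result( { response_code => 210 } ), "\n";
```

Checking only for 200 would skip results served from the cache, which carry code 210.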

INTERNAL METHODS

_form_response

Takes a result hash and a Response object and merges them into a single flat hash, allowing results from different geocoders to be combined more easily.

_lookup_callback

Given a name and a list of mappings from names to code references, do a fuzzy lookup of the name and return the appropriate subroutine.

_response_valid

Checks that a response is defined and has a valid response code.

_passes_filter

Checks that a response passes the filter callback (if one is set).

_get_geocoders

Returns a list of the geocoders that have been added to the Many geocoder.

_get_next_geocoder

Requests the next geocoder from the scheduler and looks it up in the geocoders hash.

_recalculate_geocoder_stats

Assigns weights to the current geocoders, and initialises the scheduler as appropriate.

_new_scheduler

Returns an instance of the currently-set scheduler, with the specified geocoders.

_set_caching_object

Sets the list of cache objects.

_test_cache_object

Tests the cache to ensure it has 'get', 'set' and 'remove' methods.

_set_in_cache

Stores the result in the cache.

_get_from_cache

Checks the cache to see if the data is available.

_normalize_location_string

Uses the provided normalize_code_ref callback (if one is set) to return a normalized version of the given location string.

NOTES

All cache objects used must support 'get', 'set' and 'remove' methods.
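
For testing, a minimal in-memory cache that satisfies this requirement could look like the sketch below. My::MemoryCache is a hypothetical class, and the argument conventions (get and remove take a key, set takes a key and a value) are an assumption modelled on common Perl cache modules.

```perl
use strict;
use warnings;

# Hypothetical in-memory cache exposing the three required methods.
{
    package My::MemoryCache;

    sub new { my ($class) = @_; return bless { store => {} }, $class }

    # Return the stored value, or undef if the key is absent.
    sub get { my ( $self, $key ) = @_; return $self->{store}{$key} }

    # Store a value under a key.
    sub set {
        my ( $self, $key, $value ) = @_;
        $self->{store}{$key} = $value;
        return 1;
    }

    # Delete a key; removing a missing key is not an error.
    sub remove {
        my ( $self, $key ) = @_;
        delete $self->{store}{$key};
        return 1;
    }
}

my $cache = My::MemoryCache->new;

# Confirm the object exposes the required interface.
for my $method (qw(get set remove)) {
    die "cache lacks $method" unless $cache->can($method);
}

# It could then be passed as:
#   Geo::Coder::Many->new( { cache => $cache } );
```

An object like this does not persist between runs or expire entries, so it is only suitable for tests and short-lived scripts.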

The input (location) string is expected to be in UTF-8. Incorrectly encoded strings will make for unreliable geocoding results. All strings returned will be in UTF-8. Returned latitude and longitude co-ordinates will be in WGS84 format.

In the case of an error, this module will print a warning and then may call die().

Geo::Coder Interface

The Geo::Coder::* modules added to the geocoding source list must have a geocode method which takes a single location string as an argument.

Currently supported Geo::Coder::* modules are:

  Geo::Coder::Bing
  Geo::Coder::Google
  Geo::Coder::Multimap
  Geo::Coder::OSM
  Geo::Coder::PlaceFinder
  Geo::Coder::Yahoo

SEE ALSO

  Geo::Coder::Bing
  Geo::Coder::Google
  Geo::Coder::Multimap
  Geo::Coder::OSM
  Geo::Coder::PlaceFinder
  Geo::Coder::Yahoo

AUTHOR

Alistair Francis, http://search.cpan.org/~friffin/

Dan Horgan

COPYRIGHT AND LICENSE

This library is free software; you can redistribute it and/or modify it under the same terms as Perl itself, either Perl version 5.10 or, at your option, any later version of Perl 5 you may have available.