Geo::libpostal - Perl bindings for libpostal
  use Geo::libpostal ':all';

  # normalize an address
  my @addresses = expand_address('120 E 96th St New York');

  # parse addresses into their components
  my %address = parse_address('The Book Club 100-106 Leonard St Shoreditch London EC2A 4RH, United Kingdom');

  # %address contains:
  # (
  #   road         => 'leonard st',
  #   postcode     => 'ec2a 4rh',
  #   house        => 'the book club',
  #   house_number => '100-106',
  #   suburb       => 'shoreditch',
  #   country      => 'united kingdom',
  #   city         => 'london'
  # );
libpostal is a C library for parsing/normalizing international street addresses.
Address strings can be normalized using expand_address, which returns a list of valid variations so you can check for duplicates in your dataset. It supports normalization in over 60 languages.
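As a minimal sketch of that duplicate check, two addresses can be treated as candidate duplicates when their expansion lists share at least one string. The helper name share_expansion and the sample lists are hypothetical, made up so the sketch runs without libpostal installed; in real use the two lists would come from expand_address.

```perl
use strict;
use warnings;

# Hypothetical helper: true if two lists of expansions have a string in common.
sub share_expansion {
    my ($first, $second) = @_;
    my %seen = map { $_ => 1 } @$first;
    my $common = grep { $seen{$_} } @$second;    # count of shared strings
    return $common ? 1 : 0;
}

# Made-up sample data standing in for two expand_address() results;
# these are not actual libpostal output.
my @first  = ('120 east 96th street new york', '120 e 96th street new york');
my @second = ('120 east 96th street new york');

print "possible duplicate\n" if share_expansion(\@first, \@second);
```

A shared expansion only suggests a duplicate; whether that is strict enough depends on the dataset.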
An address string can also be parsed into its constituent parts, such as house name, number, city and postcode, using parse_address.
  use Geo::libpostal 'expand_address';

  my @ny_addresses = expand_address('120 E 96th St New York');
  my @fr_addresses = expand_address('Quatre vingt douze R. de l\'Église');
Takes an address string and returns a list of known variants. Useful for normalization. Accepts many boolean options:
  expand_address('120 E 96th St New York',
      latin_ascii              => 1,
      transliterate            => 1,
      strip_accents            => 1,
      decompose                => 1,
      lowercase                => 1,
      trim_string              => 1,
      drop_parentheticals      => 1,
      replace_numeric_hyphens  => 1,
      delete_numeric_hyphens   => 1,
      split_alpha_from_numeric => 1,
      replace_word_hyphens     => 1,
      delete_word_hyphens      => 1,
      delete_final_periods     => 1,
      delete_acronym_periods   => 1,
      drop_english_possessives => 1,
      delete_apostrophes       => 1,
      expand_numex             => 1,
      roman_numerals           => 1,
  );
Warning: libpostal segfaults if all options are set to false.
Also accepts an arrayref of language codes per ISO 639-1:
expand_address('120 E 96th St New York', languages => [qw(en fr)]);
This is useful if you are normalizing addresses in multiple languages.
Will die on undef and empty addresses, odd numbers of options and unrecognized options. Exported on request.
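Since the function dies on bad input, callers processing untrusted data may want to trap the exception. The following is a minimal sketch, not part of the module: try_expand is a hypothetical wrapper, and the stand-in function only mimics the documented die on undef and empty addresses so that the sketch runs without libpostal installed.

```perl
use strict;
use warnings;

# Hypothetical wrapper: call an expansion function and trap its die.
# Returns the list of expansions, or an empty list on failure.
sub try_expand {
    my ($expand, $address, @options) = @_;
    my @expansions;
    my $ok = eval { @expansions = $expand->($address, @options); 1 };
    unless ($ok) {
        warn "could not expand address: $@";
        return;
    }
    return @expansions;
}

# Stand-in for expand_address: it only reproduces the die on undef and
# empty input and performs no real normalization.
my $fake_expand = sub {
    my ($address) = @_;
    die "address is undef or empty\n"
        unless defined $address && length $address;
    return (lc $address);
};

my @ok  = try_expand($fake_expand, '120 E 96th St New York');
my @bad = try_expand($fake_expand, '');
printf "%d expansion(s), %d after failure\n", scalar @ok, scalar @bad;
```

With the module installed and expand_address imported, a reference to the real function (\&expand_address) could be passed in place of the stand-in, followed by any options such as languages => ['en'].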
  use Geo::libpostal 'parse_address';

  my %ny_address = parse_address('120 E 96th St New York');
  my %fr_address = parse_address('Quatre vingt douze R. de l\'Église');
Takes an address string and parses it, returning a list of labels and values.

Note that libpostal's address parser currently ignores its language (2 character language code per ISO 639-1) and country (2 character country code per ISO 3166-1 alpha-2) options: https://github.com/openvenues/libpostal/blob/e816b4f77e8c6a7f35207ca77282ffab3712c5b6/src/address_parser.c#L837
Will die on undef and empty addresses. Exported on request.
parse_address() may return duplicate labels for invalid address strings.
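Assigning a list with repeated keys to a Perl hash keeps only the last value for each key, so duplicate labels are silently lost when the result is stored in a hash directly. A minimal sketch of one way to keep them, assuming the result is a flat list of label and value pairs: group_labels is a hypothetical helper, and the sample list is made up rather than real parser output.

```perl
use strict;
use warnings;

# Hypothetical helper: group a flat (label, value, label, value, ...) list
# into a hash of arrayrefs so repeated labels are kept, not overwritten.
sub group_labels {
    my @pairs = @_;
    my %grouped;
    while (my ($label, $value) = splice @pairs, 0, 2) {
        push @{ $grouped{$label} }, $value;
    }
    return %grouped;
}

# Made-up sample list standing in for parse_address() output.
my %parts = group_labels(
    road => 'leonard st',
    city => 'london',
    road => 'old st',
);

for my $label (sort keys %parts) {
    my @values = @{ $parts{$label} };
    print "$label: @values\n";
    warn "duplicate label '$label'\n" if @values > 1;
}
```

A label with more than one value can then be treated as a sign that the input address needs review.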
libpostal uses setup() and teardown() functions. Setup is lazily loaded. Teardown occurs in an END block automatically. Geo::libpostal will die if expand_address or parse_address is called after teardown.
libpostal is required.
You can install this module with CPAN:
$ cpan Geo::libpostal
Or clone it from GitHub and install it manually:
  $ git clone https://github.com/dnmfarrell/Geo-libpostal
  $ cd Geo-libpostal
  $ perl Makefile.PL
  $ make
  $ make test
  $ make install
© 2016 David Farrell
See LICENSE