Geo::StreetAddress::Canada - Perl extension for parsing Canadian street addresses
    use Geo::StreetAddress::Canada;

    $hashref = Geo::StreetAddress::Canada->parse_location(
        "151 Front Street West, Toronto, Ontario M1M 1M1" );

    $hashref = Geo::StreetAddress::Canada->parse_location(
        "Front & York, Toronto, Ontario" );

    $hashref = Geo::StreetAddress::Canada->parse_address(
        "151 Front Street West, Toronto, Ontario" );

    $hashref = Geo::StreetAddress::Canada->parse_informal_address(
        "Lot 3 York Street" );

    $hashref = Geo::StreetAddress::Canada->parse_intersection(
        "Spadina Avenue at Bremner Boulevard, Toronto, Ontario" );

    $hashref = Geo::StreetAddress::Canada->normalize_address( \%spec );
        # the parse_* methods call this automatically...
Geo::StreetAddress::Canada is a regex-based street address and street intersection parser for Canada. Its basic goal is to be as forgiving as possible when parsing user-provided address strings. Geo::StreetAddress::Canada knows about directional prefixes and suffixes, fractional building numbers, building units, grid-based addresses, postal codes, and all of the official Canada Post abbreviations for street types, province names, and secondary unit designators. Note that this extension returns data in English only. If you are looking for French language support, please see Geo::StreetAddress::FR(3pm). Patches are welcome if someone wishes to combine the two!
Most Geo::StreetAddress::Canada methods return a reference to a hash containing address or intersection information. This "address specifier" hash may contain any of the following fields for a given address. If a given field is not present in the address, the corresponding key will be set to undef in the hash.
Future versions of this module may add extra fields.
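Because absent fields are present in the hash with an undef value, code that prints or compares a specifier needs to check definedness rather than key existence. The following is a minimal sketch of that; describe_spec is a hypothetical helper, not part of the module, and the demo at the end only runs if Geo::StreetAddress::Canada is installed.

```perl
use strict;
use warnings;

# Hypothetical helper (not part of the module): render a specifier hash as
# "key => value" lines, showing absent fields explicitly as "undef".
sub describe_spec {
    my ($spec) = @_;
    return map {
        my $value = defined $spec->{$_} ? $spec->{$_} : 'undef';
        "$_ => $value";
    } sort keys %$spec;
}

# Demo: only runs when the module is actually installed.
if ( eval { require Geo::StreetAddress::Canada; 1 } ) {
    my $parsed = Geo::StreetAddress::Canada->parse_location(
        '151 Front Street West, Toronto, Ontario M1M 1M1' );
    print "$_\n" for describe_spec( $parsed || {} );
}
```

Iterating over whatever keys come back, instead of a fixed list, also keeps the code working if a later release adds fields.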
House or street number.
Directional prefix for the street, such as N, NE, E, etc. A given prefix should be one to two characters long.
Name of the street, without directional or type qualifiers.
Abbreviated street type, e.g. Rd, St, Ave, etc. See the Canada Post Addressing Guidelines at http://www.canadapost.ca/tools/pg/manual/PGaddress-e.asp#1423617 for a list of abbreviations used.
Directional suffix for the street, as above.
Name of the city, town, or other locale that the address is situated in.
The province in which the address is situated, given as its two-letter postal abbreviation. See the Canada Post Addressing Guidelines for a list of abbreviations used.
Postal code for the address, with a space separating the FSA (Forward Sortation Area) and the LDU (Local Delivery Unit), e.g. M1M 1M1.
If the address includes a Secondary Unit Designator, such as a room, suite or apartment, the sec_unit_type field will indicate the type of unit.
If the address includes a Secondary Unit Designator, such as a room, suite or apartment, the sec_unit_num field will indicate the number of the unit (which may not be numeric).
Directional prefixes for the streets in question.
Names of the streets in question.
Street types for the streets in question.
Directional suffixes for the streets in question.
City or locale containing the intersection, as above.
Province abbreviation, as above.
Postal code for address, as above.
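The postal code layout used in specifiers (three-character FSA, a space, three-character LDU) can be reproduced for your own input with a short regex. This is only a sketch under assumptions: format_postal_code is a hypothetical helper, not the module's own code, and it checks the letter-digit-letter digit-letter-digit shape without validating which letters Canada Post actually assigns.

```perl
use strict;
use warnings;

# Hypothetical helper, not the module's own code: accept a postal code with
# or without a separator and return it as "FSA LDU", or undef if it does not
# have the letter-digit-letter digit-letter-digit shape.
sub format_postal_code {
    my ($raw) = @_;
    return undef unless defined $raw;
    my ( $fsa, $ldu ) =
        $raw =~ /^\s*([A-Za-z]\d[A-Za-z])[\s-]?(\d[A-Za-z]\d)\s*$/
        or return undef;
    return uc "$fsa $ldu";
}

print format_postal_code('m1m1m1'), "\n";    # prints "M1M 1M1"
```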
Geo::StreetAddress::Canada contains a number of global variables which it uses to recognize different parts of Canadian street addresses. Although you will probably not need them, they are documented here for completeness.
Maps directional names (north, northeast, etc.) to abbreviations (N, NE, etc.).
Maps directional abbreviations to directional names.
Maps lowercased English street types from the Canada Post standard to their canonical postal abbreviations.
Maps lowercased Canadian province or territory names to their canonical two-letter postal abbreviations.
A hash of compiled regular expressions corresponding to different types of address or address portions. Defined regexen include type, number, fraction, state, direct(ion), dircode, zip, corner, street, place, address, and intersection.
Direct use of these patterns is not recommended because they may change in subtle ways between releases.
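The two directional maps described above are inverses of each other. As an illustration of that relationship only, here is a sketch with a small hand-written table; the variable and function names are hypothetical and the data is not copied from the module.

```perl
use strict;
use warnings;

# Illustrative data only; the module keeps its own tables.
my %dir_to_abbr = (
    north     => 'N',  south     => 'S',
    east      => 'E',  west      => 'W',
    northeast => 'NE', northwest => 'NW',
    southeast => 'SE', southwest => 'SW',
);

# Inverting the hash gives the abbreviation-to-name direction.
my %abbr_to_dir = reverse %dir_to_abbr;

# Hypothetical lookup helpers; both are case-insensitive.
sub abbreviate_direction { my ($name) = @_; return $dir_to_abbr{ lc $name } }
sub expand_direction     { my ($abbr) = @_; return $abbr_to_dir{ uc $abbr } }

print abbreviate_direction('Northeast'), "\n";    # prints "NE"
print expand_direction('sw'), "\n";               # prints "southwest"
```

Inverting with reverse is only safe because every abbreviation in the table is unique.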
If avoid_redundant_street_type is true, normalize_address() sets the type field to undef when the street field contains a word that corresponds to the type in %Street_Type.
For example, given "4321 Country Road 7", street will be "Country Road 7" and type will be "Rd". With avoid_redundant_street_type set to true, type will be undef because street matches /\b (rd|road) \b/ix.
The same applies to type1 with street1 and to type2 with street2 for intersections.
The default is false, for backwards compatibility.
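To make the effect of avoid_redundant_street_type concrete, the check can be approximated outside the module as follows. This is a sketch under assumptions: drop_redundant_type is a hypothetical helper, and the small %street_type table stands in for the module's %Street_Type; it does not reproduce the module's actual implementation.

```perl
use strict;
use warnings;

# Small illustrative map from lowercased street types to abbreviations; a
# stand-in for the module's %Street_Type, not a copy of it.
my %street_type = ( road => 'rd', rd => 'rd', street => 'st', st => 'st' );

# Hypothetical helper: clear $spec->{type} when a word in $spec->{street}
# maps to the same abbreviation as the type.
sub drop_redundant_type {
    my ($spec) = @_;
    return $spec unless defined $spec->{street} && defined $spec->{type};
    my $type = lc $spec->{type};
    for my $word ( split /\W+/, lc $spec->{street} ) {
        my $abbr = $street_type{$word};
        if ( defined $abbr && $abbr eq $type ) {
            $spec->{type} = undef;
            last;
        }
    }
    return $spec;
}

my $example = drop_redundant_type(
    { street => 'Country Road 7', type => 'Rd' } );
print defined $example->{type} ? $example->{type} : 'undef', "\n";    # prints "undef"
```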
    # Add another street type mapping:
    $Geo::StreetAddress::Canada::Street_Type{'cur'} = 'curv';

    # Re-initialize to pick up the change
    Geo::StreetAddress::Canada::init();
Runs the setup on globals. This is run automatically when the module is loaded, but if you subsequently change the globals, you should run it again.
$spec = Geo::StreetAddress::Canada->parse_location( $string )
Parses any address or intersection string and returns the appropriate specifier. If $string matches the $Addr_Match{corner} pattern, parse_intersection() is used. Otherwise parse_address() is called, and if that returns false, parse_informal_address() is tried.
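The order of attempts in parse_location() can be pictured with the sketch below. Everything in it is an assumption made for illustration: locate is a hypothetical function, the parsers are passed in as stub code refs so the example runs without the module, and the corner regex is a simplified guess, not the real $Addr_Match{corner} pattern.

```perl
use strict;
use warnings;

# Hypothetical stand-in for the dispatch order: intersection first, then a
# formal address, then an informal address as the last resort.
sub locate {
    my ( $string, %parser ) = @_;
    my $corner = qr/\s(?:and|at|&)\s/i;    # assumed, not $Addr_Match{corner}
    return $parser{intersection}->($string) if $string =~ $corner;
    return $parser{address}->($string) || $parser{informal}->($string);
}

# Stub parsers that only record which branch was taken.
my %stub = (
    intersection => sub { return +{ via => 'parse_intersection' } },
    address      => sub { return undef },    # pretend the formal parse failed
    informal     => sub { return +{ via => 'parse_informal_address' } },
);

my $first  = locate( 'Front & York, Toronto, Ontario', %stub );
my $second = locate( 'Lot 3 York Street', %stub );
print "$first->{via}\n";     # prints "parse_intersection"
print "$second->{via}\n";    # prints "parse_informal_address"
```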
$spec = Geo::StreetAddress::Canada->parse_address( $address_string )
Parses a street address into an address specifier using the $Addr_Match{address} pattern. Returns undef if the address cannot be parsed as a complete formal address.
You may want to use parse_location() instead.
$spec = Geo::StreetAddress::Canada->parse_informal_address( $address_string )
Acts like parse_address() except that it handles a wider range of address formats because it uses the "informal_address" pattern. A unit can come first, a street number is optional, and the city and province aren't needed, so informal addresses like "#42 123 Main St" can be parsed.
Returns undef if the address cannot be parsed.
$spec = Geo::StreetAddress::Canada->parse_intersection( $intersection_string )
Parses an intersection string into an intersection specifier, returning undef if the intersection cannot be parsed. You probably want to use parse_location() instead.
$spec = Geo::StreetAddress::Canada->normalize_address( $spec )
Takes an address or intersection specifier, and normalizes its components, stripping out all leading and trailing whitespace and punctuation, and substituting official abbreviations for prefix, suffix, type, and state values. Also, city names that are prefixed with a directional abbreviation (e.g. N, NE, etc.) have the abbreviation expanded. The original specifier ref is returned.
Typically, you won't need to use this method, as the parse_*() methods call it for you.
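Note that normalization works in place and hands back the same reference it was given. The sketch below illustrates only the trimming step under that contract; tidy_spec is a hypothetical helper and does not perform the abbreviation substitution or city-name expansion that normalize_address() does.

```perl
use strict;
use warnings;

# Hypothetical helper covering only the trimming step: strip leading and
# trailing whitespace and punctuation from every defined value, in place.
sub tidy_spec {
    my ($spec) = @_;
    for my $key ( keys %$spec ) {
        next unless defined $spec->{$key};
        $spec->{$key} =~ s/^[\s[:punct:]]+//;
        $spec->{$key} =~ s/[\s[:punct:]]+$//;
    }
    return $spec;    # same reference that was passed in
}

my $input  = { city => ' Toronto, ', street => 'Front.', type => undef };
my $output = tidy_spec($input);
print "[$output->{city}] [$output->{street}]\n";    # prints "[Toronto] [Front]"
print $output == $input ? "same ref\n" : "copy\n";  # prints "same ref"
```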
Geo::StreetAddress::Canada might not correctly parse house numbers that contain hyphens.
This software was originally part of Geo::StreetAddress::US (q.v.) but was split apart into an independent module for your convenience. Therefore it has some behaviors which were designed for Geo::StreetAddress::US, but which may not be right for your purposes. If this turns out to be the case, please let me know.
Geo::StreetAddress::Canada does NOT perform Canada Post certified address normalization.
French addresses are not supported. This extension will only output data in English. If you require support for French addresses, please see Geo::StreetAddress::FR(3pm). Patches are welcome to combine the two!
This software was originally part of Geo::StreetAddress::US(3pm).
Lingua::EN::AddressParse(3pm) and Geo::PostalAddress(3pm) both do something very similar to Geo::StreetAddress::Canada, but are either too strict/limited in their address parsing, or not really specific enough in how they break down addresses (for my purposes).
Canada Post Addressing Guidelines: http://www.canadapost.ca/tools/pg/manual/PGaddress-e.asp
Thanks to Schuyler D. Erle <schuyler@geocoder.us>, the author of Geo::StreetAddress::US, for providing a very solid base upon which to build an extension tailored for Canadian use.
Scott Burlovich <teedot@gmail.com>
Copyright (C) 2013 by Scott Burlovich.
This library is free software; you can redistribute it and/or modify it under the same terms as Perl itself, either Perl version 5.8.4 or, at your option, any later version of Perl 5 you may have available.