NAME

Geo::Postcodes::JP::Process - process Japan Post Office postcode data

read_ken_all

    my $postcodes_ref = read_ken_all ('KEN_ALL.CSV');

Read the file KEN_ALL.CSV. The return value is an array reference containing the lines of the postcode file in the same order as the file itself. The routine issues a fatal error if a problem is encountered.

The return value is doubly indexed: the first index selects a line of the file and the second selects a field within that line.

process_line

    my %values = process_line ($line);

Turn a line of the postcode file into a hash of its values.

The keys of the hash are

number

The JIS code number for the region. The JIS standards for regions of Japan are JIS X 0401 (1973) for the prefecture identification codes, and JIS X 0402 (2003) for the identification codes for cities, towns and villages.

old_postcode

The old three or five digit postcode.

new_postcode

The new seven digit postcode.

ken_kana

The kana version of the prefecture.

city_kana

The kana version of the city.

address_kana

The kana version of the address.

ken_kanji

The kanji version of the prefecture.

city_kanji

The kanji version of the city.

address_kanji

The kanji version of the address.

one-region-multiple-postcodes

This is 1 if the same address has more than one postcode, zero otherwise.

numbering-start

This is 1 if street numbering starts afresh within each small subdivision of the address, zero otherwise.

has-choume

This is 1 if the address is divided into "choume", zero otherwise.

one-postcode-multiple-regions

This is 1 if the same postcode covers more than one region, zero otherwise.

koushin-no-hyouji

Indicates whether the entry has been updated: 0 = no change, 1 = changed, 2 = deleted.

henkou-riyuu

Reason for change.

See also the Japan Post explanation of the KEN_ALL.CSV file in Japanese.
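As an illustration of the hash described above, the following minimal sketch splits the sample line from the "Address" section of TERMINOLOGY into the documented keys. It does not use this module. The helper sketch_process_line and its naive comma splitting are assumptions made for the example, and the keys are assumed to follow the order in which they are listed above.

```perl
use strict;
use warnings;
use utf8;

# Keys in the order they are listed in this document (assumed to
# match the column order of KEN_ALL.CSV).
my @keys = qw/
    number old_postcode new_postcode
    ken_kana city_kana address_kana
    ken_kanji city_kanji address_kanji
    one-region-multiple-postcodes numbering-start has-choume
    one-postcode-multiple-regions koushin-no-hyouji henkou-riyuu
/;

# Hypothetical helper, not part of Geo::Postcodes::JP::Process.
# Naive parsing: assumes no field contains a comma or a quote.
sub sketch_process_line
{
    my ($text) = @_;
    my @fields = split /,/, $text;
    for (@fields) {
        s/^"|"$//g;    # strip the surrounding double quotes
        s/\s+$//;      # old postcodes are padded with spaces
    }
    die "Expected " . scalar (@keys) . " fields" unless @fields == @keys;
    my %values;
    @values{@keys} = @fields;
    return %values;
}

my $sample = '08201,"310  ","3100004","イバラキケン","ミトシ","アオヤギチョウ","茨城県","水戸市","青柳町",0,0,0,0,0,0';
my %values = sketch_process_line ($sample);
binmode STDOUT, ':encoding(UTF-8)';
print "$values{new_postcode}: $values{ken_kanji}$values{city_kanji}$values{address_kanji}\n";
```

Real data is better parsed with a proper CSV parser, since the sketch breaks as soon as a field contains a comma.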

concatenate_multi_line

    $postcodes = concatenate_multi_line ($postcodes, $duplicates);

Concatenate a single entry which is spread over multiple lines. $duplicates is the return value of "find_duplicates".

Some of the entries in the CSV file are actually single entries which have been broken into two or more lines because the number of characters in one of the fields exceeds a maximum. This routine attempts to put this broken data back together again.

At the moment there is no comprehensive check of correctness of the result.
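To illustrate the idea, the sketch below joins the address fields of lines which share a postcode and are listed together in a duplicates hash of the kind described under "find_duplicates". It is a simplified assumption about how such a merge could work, not the module's actual algorithm. The helper sketch_concatenate, the field positions and the made-up rows are all hypothetical.

```perl
use strict;
use warnings;

# Hypothetical helper, not part of Geo::Postcodes::JP::Process.
# $postcodes is an array reference of array references (one per line)
# and $duplicates maps a postcode to the offsets of its lines. The
# address is assumed to be field 3 of these shortened, made-up rows.
# Note that the first line of each entry is modified in place.
sub sketch_concatenate
{
    my ($postcodes, $duplicates) = @_;
    my %remove;
    for my $postcode (keys %$duplicates) {
        my ($first, @rest) = @{$duplicates->{$postcode}};
        for my $offset (@rest) {
            # Append the continuation to the first line of the entry.
            $postcodes->[$first][3] .= $postcodes->[$offset][3];
            $remove{$offset} = 1;
        }
    }
    # Return the lines which were not merged into an earlier line.
    my @kept = grep { ! $remove{$_} } 0..$#$postcodes;
    return [@{$postcodes}[@kept]];
}

my $postcodes = [
    ['00001', '100', '1000001', 'AAA (BBB, '],
    ['00001', '100', '1000001', 'CCC)'],
    ['00002', '200', '2000002', 'DDD'],
];
my $duplicates = {'1000001' => [0, 1]};
my $joined = sketch_concatenate ($postcodes, $duplicates);
print scalar (@$joined), " lines; first address: $joined->[0][3]\n";
```

Real data needs extra checks before merging, because lines which share a postcode are not always fragments of one entry: the "one-postcode-multiple-regions" field marks postcodes which legitimately cover more than one region.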

find_duplicates

    my $duplicates = find_duplicates ($postcodes);

Make a hash whose keys are the postcodes which appear more than once in the postcode file, and whose values are references to arrays of the offsets of those lines in the postcode file. The return value is a reference to the hash.
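The structure described above can be sketched as follows. This is a minimal illustration and not the module's own code. The helper sketch_find_duplicates and the sample rows are made up for the example, and the seven-digit postcode is assumed to be the third field (index 2) of each line.

```perl
use strict;
use warnings;

# Hypothetical helper, not part of Geo::Postcodes::JP::Process.
# $postcodes is assumed to be an array reference of array references,
# with the seven-digit postcode in field 2 of each line.
sub sketch_find_duplicates
{
    my ($postcodes) = @_;
    my %offsets;
    for my $i (0..$#$postcodes) {
        my $postcode = $postcodes->[$i][2];
        push @{$offsets{$postcode}}, $i;
    }
    # Keep only the postcodes which occur on more than one line.
    for my $postcode (keys %offsets) {
        if (@{$offsets{$postcode}} < 2) {
            delete $offsets{$postcode};
        }
    }
    return \%offsets;
}

# Made-up rows containing only the fields needed for the example.
my $postcodes = [
    ['00001', '100', '1000001'],
    ['00002', '200', '2000002'],
    ['00002', '200', '2000002'],
    ['00003', '300', '3000003'],
];
my $duplicates = sketch_find_duplicates ($postcodes);
for my $postcode (sort keys %$duplicates) {
    print "$postcode: @{$duplicates->{$postcode}}\n";
}
```

With the rows above, only the postcode 2000002 is kept, with the offsets 1 and 2.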

read_jigyosyo

    my $jigyosyo_data = read_jigyosyo ('/path/to/jigyosyo/csv/file');

process_jigyosyo_line

    my %values = process_jigyosyo_line ($line);

Turn the array reference $line into a hash of its values using the fields.

The keys of the hash are

number

As for the main postcode file.

kana

The name of the place of business in kana.

kanji

The name of the place of business in kanji.

ken_kanji

The kanji version of the prefecture name.

city_kanji

The kanji version of the city name.

address_kanji

The kanji version of the address name.

street_number

The exact street number of the place of business.

new_postcode

As for the "ken_all" fields.

old_postcode

As for the "ken_all" fields.

post-office

The post office which handles mail for this postcode.

type

0=Large place of business, 1=Post office box (私書箱)

multiple-postcode

0=Not multiple. The values 1, 2 and 3 are used when the place of business has more than one postcode.

Alteration code

0=No change, 1=New addition, 2=Deleted

See also the Japan Post explanation of the JIGYOSYO.CSV file in Japanese.

remove_bad_addresses

    $postcodes = remove_bad_addresses ($postcodes);

improve_postcodes

    $postcodes = improve_postcodes ($postcodes);

Improve the postcodes as much as possible by unifying lines etc.

TERMINOLOGY

Postcode

In this module, "postcode" is the translation used for the Japanese term "yuubin bangou" (郵便番号). They might be called "postal codes" or even "zip codes" by some.

This module only deals with the seven-digit modern postcodes introduced in 1998. It does not handle the three and five digit postcodes which were used until 1998.

Ken

In this module, "ken" in a variable name means the Japanese system of prefectures, which includes the "ken" divisions as well as the "do/fu/to" divisions, with "do" used for Hokkaido, "fu" for Osaka and Kyoto, and "to" for the Tokyo metropolis. The module refers to all of these using the word "ken".

See also the sci.lang.japan FAQ on Japanese addresses.

City

In this module, "city" is the term used to point to the second field in the postcode data file. Some of these are actually cities, like "Mito-shi" (水戸市), the city of Mito in Ibaraki prefecture. However, some of them are not really cities but other geographical subdivisions, such as gun/machi or shi/ku combinations.

Address

In this module, "address" is the term used to point to the third field in the postcode data file. This is called 町域 (chouiki) by the Post Office.

For example, in the following data file entry, "3100004" is the postcode, "茨城県" (Ibaraki-ken) is the "ken", "水戸市" (Mito-shi) is the "city", and "青柳町" (Aoyagicho) is the "address".

    08201,"310  ","3100004","イバラキケン","ミトシ","アオヤギチョウ","茨城県","水戸市","青柳町",0,0,0,0,0,0

Jigyosyo

In this module, "jigyosyo" is the term used to point to places of business. Some places of business have their own postcodes.

The term "jigyosyo" is used because it is the Post Office's own romanization. It is actually an error, and should be either jigyōsho or zigyôsyo in the standard romanizations of Japanese, or jigyosho in simplified Hepburn. See the sci.lang.japan FAQ page on Japanese romanization.

Street number

In this module, "street number" is an arbitrary way of describing the final part of the address, which may actually specify a variety of things, such as the ban-chi, or even which floor of a building the postcode refers to.

The street number field is mostly relevant for the jigyosyo postcodes, but also crops up in some of the addresses, especially for rural areas.

AUTHOR

Ben Bullock, <bkb@cpan.org>

COPYRIGHT AND LICENSE

Geo::Postcodes::JP and associated files are copyright (c) 2012 Ben Bullock.

You may use, copy, modify and distribute Geo::Postcodes::JP under the same terms as the Perl programming language itself.