The Perl Toolchain Summit needs more sponsors. If your company depends on Perl, please support this very important event.

NAME

Time::OlsonTZ::Clustered - Olson time zone clusters based on similar offset and DST changes

VERSION

version 0.002

SYNOPSIS

    use Time::OlsonTZ::Clustered ':all';

    say $_->timezone_name for @{ primary_zones('US') };
    # Pacific/Honolulu
    # America/Adak
    # America/Anchorage
    # America/Los_Angeles
    # America/Metlakatla
    # America/Denver
    # America/Phoenix
    # America/Chicago
    # America/New_York

    say find_primary("America/Indiana/Indianapolis");
    # America/New_York

DESCRIPTION

There are over 400 Olson time zone names (e.g. "America/New_York") describing current and historical offset and daylight-savings behavior. While this is essential for accurate calculations involving times in the past, it is an overwhelming list to present as part of a user experience current behavior is relevant. (E.g. "Choose your time zone")

For example, China has had only one official time (UTC+8) since 1949, but there are five Olson time zones corresponding to historical districts:

    Asia/Shanghai
    Asia/Chongqing
    Asia/Kashgar
    Asia/Urumqi
    Asia/Harbin

When presenting a list of time zone choices, there are many situations in which it is sufficient to present "Asia/Shanghai" for China. Likewise, the United States has consolidated some of its historically fragmented time zone observance. Instead of asking someone if they are in "America/Indiana/Indianapolis", it is sufficient to ask them to pick "America/New_York" (a.k.a. "US Eastern Time").

This module provides a pre-calculated clustering the 400+ Olson time zones by country and by time observance behavior, allowing a consolidated list of "primary zones" to be offered for each country.

The clustering was developed using the following heuristics and modifications:

  • For each country, cluster time zones by observance behavior

  • Zones cluster together if they have the same UTC offset at local noon for the next 365 days

  • If a cluster contains multiple zones, the author selected a primary zone using research and judgment

  • Multiple-zone clusters were given a descriptive name

Cluster descriptions were based on either the primary zone description (the Olson 'region_description' field) -- e.g. "McMurdo Station, Ross Island" for the cluster containing "Antartica/McMurdo" and "Antarctica/South_Pole" -- or else a subjectively-determined, broadly descriptive term common across the zones -- e.g. "Central time" for various US zones with a UTC-6 offset.

When a country had a single cluster, the cluster description was left blank, similar to how the Olson database leaves the time zone description blank. (N.B. the primary_zones() function returns the country name when there is no cluster description.)

Some additional modifications were made to account for errors in the Olson time zone files.

Cluster names were made on a best-efforts basis by the author. If you have suggested improvements, please file a bug report with your ideas.

The clustering will be updated over time as the Olson time zone database changes.

FUNCTIONS

primary_zones

    my $zones = primary_zones('US');

Takes a country code and returns a reference to an array of hash references. Each element in the array represents one timezone cluster in the country, sorted by UTC offset. The hash reference contains the following keys:

  • description: Description of the zone or the Olson country name if there is only one cluster

  • offset: UTC offset, expressed in hours ('+5', '-2')

  • timezone_name: the primary Olson zone name for the cluster ('America/Chicago')

For example, here are some of the items returned from primary_zones('AQ'):

    [
        {
            'description'   => 'Palmer Station, Anvers Island',
            'offset'        => -4,
            'timezone_name' => 'Antarctica/Palmer'
        },
        {
            'description'   => 'Rothera Station, Adelaide Island',
            'offset'        => -3,
            'timezone_name' => 'Antarctica/Rothera'
        },
        ...
    ]

The description may be the Olson description of the primary zone or it may be a custom alternative that the author feels best describes the cluster.

Offsets given are the smallest offset observed during the year. This should correspond to the non-daylight savings time offset in zones that observe daylight savings time for part of the year.

The primary zone is the best guess at the most common or recognizable Olson name in the cluster; see the "DESCRIPTION" section for details.

If an invalid country code is given, the function returns an empty array reference.

find_primary

    my $primary = find_primary("America/Indiana/Indianapolis");
    # returns "America/New_York"

Given an Olson time zone name, returns the primary zone name for the cluster containing the zone. Returns undef or the empty list if the zone is not recognized.

is_primary

    if ( is_primary("America/Chicago") ) { ... }

A boolean function to check if a time zone is primary for its cluster.

country_codes

    for my $cc ( country_codes() ) {
        ...
    }

Returns a sorted list of known country codes in the cluster database.

country_name

    my $name = country_name("US");

Returns the Olson country name (or an empty string) for a given country code. This duplicates information available elsewhere and is provided here for convenience.

timezone_clusters

    for my $cluster ( @{ timezone_clusters('US') } ) {
        ...
    }

Given a country code, returns an array reference of raw cluster data for the country (or an empty array reference if the country code is not found).

Each cluster is a hash reference with description and zones. The zones entry is an array reference of time zone hashes similar to that returned by primary_zones, but with olson_description containing the the original Olson description rather than description for the cluster. Note that for single-cluster countries, the cluster description will be blank.

For example, timezone_clusters("US") will return a data structure like this:

    [
        {
            'description' => 'Hawaii',
            'zones'       => [
                {
                    'offset'            => -10,
                    'olson_description' => 'Hawaii',
                    'timezone_name'     => 'Pacific/Honolulu'
                }
            ]
        },
        {
            'description' => 'Aleutian Islands',
            'zones'       => [
                {
                    'offset'            => -10,
                    'olson_description' => 'Aleutian Islands',
                    'timezone_name'     => 'America/Adak'
                }
            ]
        },
        {
            'description' => 'Alaska Time',
            'zones'       => [
                {
                    'offset'            => '-9',
                    'olson_description' => 'Alaska Time',
                    'timezone_name'     => 'America/Anchorage'
                },
                {
                    'offset'            => -9,
                    'olson_description' => 'Alaska Time - west Alaska',
                    'timezone_name'     => 'America/Nome'
                },
                {
                    'offset'            => '-9',
                    'olson_description' => 'Alaska Time - Alaska panhandle',
                    'timezone_name'     => 'America/Juneau'
                },
                {
                    'offset'            => '-9',
                    'olson_description' => 'Alaska Time - Alaska panhandle neck',
                    'timezone_name'     => 'America/Yakutat'
                },
                {
                    'offset'            => '-9',
                    'olson_description' => 'Alaska Time - southeast Alaska panhandle',
                    'timezone_name'     => 'America/Sitka'
                },
            ]
        },
        ...
    ]

find_cluster

    my $cluster = find_cluster("America/Indiana/Indianapolis");

Given an Olson time zone name, returns the cluster data structure containing the time zone. It returns undef or the empty list if the time zone name is not recognized.

USAGE

All functions are optionally exported using Sub::Exporter.

SEE ALSO

ACKNOWLEDGMENTS

The author would like to thank the following people for their help:

  • Andrew Main (ZEFRAM) for his time zone modules and advice on zone clustering heuristics.

  • Breno G. de Oliveira (GARU) for his patient explanations and advice regarding Brazilian time zones.

Any errors are solely those of the author (or the upstream Olson database).

SUPPORT

Bugs / Feature Requests

Please report any bugs or feature requests through the issue tracker at https://github.com/dagolden/time-olsontz-clustered/issues. You will be notified automatically of any progress on your issue.

Source Code

This is open source software. The code repository is available for public review and contribution under the terms of the license.

https://github.com/dagolden/time-olsontz-clustered

  git clone git://github.com/dagolden/time-olsontz-clustered.git

AUTHOR

David Golden <dagolden@cpan.org>

COPYRIGHT AND LICENSE

This software is Copyright (c) 2012 by David Golden.

This is free software, licensed under:

  The Apache License, Version 2.0, January 2004