NAME

Data::ID::Exim - generate Exim message IDs

SYNOPSIS

    use Data::ID::Exim qw(exim_mid exim_mid36);

    $mid = exim_mid;
    $mid = exim_mid36;

    use Time::Unix ();
    use Data::ID::Exim qw(
        exim_mid_time exim_mid36_time read_exim_mid read_exim_mid36);

    $mid_time = exim_mid_time(Time::Unix::time());
    $mid_time = exim_mid36_time(Time::Unix::time());
    ($sec, $usec, $pid) = read_exim_mid($mid);
    ($sec, $usec, $pid) = read_exim_mid36($mid);

    use Data::ID::Exim qw(base62 base36 read_base62 read_base36);

    $digits = base62(3, $value);
    $digits = base36(3, $value);
    $value = read_base62($digits);
    $value = read_base36($digits);

DESCRIPTION

This module supplies functions which generate IDs using the algorithms that the Exim MTA uses to generate message IDs, and functions to manipulate such IDs. Exim has two schemes for message IDs, one using base 62 to compactly represent numeric components and one using base 36. Base 62 is the preferred system, and is used where filenames are case-sensitive. Base 36, which yields monocase (specifically uppercase) message IDs, is used where filenames are case-insensitive. Apart from the radix the two schemes are very similar. This module supplies separate functions for the two schemes.

FUNCTIONS

All of these functions come in matched pairs, for the base-62 and the base-36 message ID schemes. Each pair is described together, because the functions are used identically.

exim_mid
exim_mid36

Generates an Exim message ID. (This ID may, of course, be used to label things other than mail messages, but Exim refers to them as message IDs.) The ID is based on the time and process ID, such that it is guaranteed to be unique among IDs generated by this algorithm on this host. This function is completely interoperable with Exim, in the sense that it uses exactly the same algorithm so that the uniqueness guarantee applies between IDs generated by this function and by Exim itself.

The format of the message ID is three groups of base 62 or base 36 digits respectively, separated by hyphens. The first group, of six digits, gives the integral number of seconds since the epoch. The second group, also of six digits, gives the process ID. The third group, of two digits, gives the fractional part of the number of seconds since the epoch, in units of 1/2000 of a second (500 us) or 1/1000 of a second (1000 us) respectively. The function does not return until the clock has advanced far enough that another call would generate a different ID.
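
As a rough illustration of the layout just described, the following sketch assembles an ID from explicit second, microsecond and process ID values. The names sketch_digits and sketch_mid are hypothetical and are not part of this module. The digit alphabet (0-9, then A-Z, then a-z, with base 36 using only the first 36 characters) is an assumption, chosen to agree with the uppercase base 36 IDs mentioned in the DESCRIPTION. The sketch does not wait for the clock to advance, so it carries no uniqueness guarantee.

```perl
use strict;
use warnings;

# Assumed digit alphabet; base 36 uses only the first 36 characters.
my @ALPHABET = ('0'..'9', 'A'..'Z', 'a'..'z');

# Fraction-of-second units per second for each radix, as described above.
my %UNITS_PER_SEC = (62 => 2000, 36 => 1000);

# Hypothetical helper: the least significant $ndigits digits of $value.
sub sketch_digits {
    my ($radix, $ndigits, $value) = @_;
    my $digits = '';
    for (1 .. $ndigits) {
        $digits = $ALPHABET[$value % $radix] . $digits;
        $value  = int($value / $radix);
    }
    return $digits;
}

# Hypothetical helper: lay out seconds, PID and fraction as an ID string.
sub sketch_mid {
    my ($radix, $sec, $usec, $pid) = @_;
    my $fraction = int($usec * $UNITS_PER_SEC{$radix} / 1_000_000);
    return join '-',
        sketch_digits($radix, 6, $sec),
        sketch_digits($radix, 6, $pid),
        sketch_digits($radix, 2, $fraction);
}

print sketch_mid(62, 62, 500_000, 61), "\n";   # 000010-00000z-G8
print sketch_mid(36, 62, 500_000, 61), "\n";   # 00001Q-00001P-DW
```

In real use the inputs would come from the system clock and the current process ID; the fixed values here only make the output easy to check by hand.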

The strange structure of the ID comes from compatibility with earlier versions of Exim, in which the last two digits were a sequence number.

exim_mid(HOST_NUMBER)
exim_mid36(HOST_NUMBER)

Exim has limited support for making message IDs unique among a group of hosts. Each host is assigned a number in the range 0 to 16 (base 62) or 0 to 11 (base 36), inclusive. The last two digits of the message ID then encode the host number multiplied by 200 or 100 respectively, plus the fractional part of the number of seconds since the epoch in units of 1/200 of a second (5 ms) or 1/100 of a second (10 ms) respectively. This makes message IDs unique across the group of hosts, at the expense of generation rate.

To generate this style of ID, pass the host number to exim_mid or exim_mid36. The host number must be configured by some out-of-band mechanism.
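
The arithmetic behind the host-number variant can be sketched as follows. This computes only the number that the last two digits encode, using the limits and units given above; sketch_host_fraction and the %SCHEME table are hypothetical names, not part of this module.

```perl
use strict;
use warnings;

# Per-radix limits taken from the description above.
my %SCHEME = (
    62 => { max_host => 16, units_per_sec => 200 },
    36 => { max_host => 11, units_per_sec => 100 },
);

# Hypothetical helper: the number encoded by the last two digits of the ID.
sub sketch_host_fraction {
    my ($radix, $host_number, $usec) = @_;
    my $scheme = $SCHEME{$radix} or die "unsupported radix $radix\n";
    die "host number $host_number out of range\n"
        if $host_number < 0 || $host_number > $scheme->{max_host};
    # Fraction of the second, in units of 5 ms (base 62) or 10 ms (base 36).
    my $units = int($usec * $scheme->{units_per_sec} / 1_000_000);
    return $host_number * $scheme->{units_per_sec} + $units;
}

print sketch_host_fraction(62, 3, 250_000), "\n";    # 650
print sketch_host_fraction(62, 16, 999_999), "\n";   # 3399, below 62**2 = 3844
print sketch_host_fraction(36, 11, 999_999), "\n";   # 1199, below 36**2 = 1296
```

The last two calls show that the largest permitted host number still yields a value that fits in two digits of the relevant base.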

exim_mid_time(TIME)
exim_mid36_time(TIME)

Because the first section of an Exim message ID encodes the time to a resolution of a second, these IDs sort chronologically, to within a second, under plain string comparison. For the purposes of lexical comparison using this feature, it is sometimes useful to construct a string encoding a specified time in Exim message ID format. (This can also be used as a very concise clock display.)

This function constructs the initial time portion of an Exim message ID. TIME must be an integral Unix time number. The corresponding six-digit string is returned.
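
A minimal sketch of the lexical comparison use case, for the base 62 scheme. Here sketch_mid_time is a hypothetical stand-in for exim_mid_time, written out so that the example runs without this module installed; the digit alphabet (0-9, A-Z, a-z) is an assumption, and the IDs in @mids are made up for illustration.

```perl
use strict;
use warnings;

# Assumed base 62 digit alphabet, which is in ascending character-code order.
my @ALPHABET = ('0'..'9', 'A'..'Z', 'a'..'z');

# Hypothetical stand-in for exim_mid_time: six base 62 digits for $time.
sub sketch_mid_time {
    my ($time) = @_;
    my $digits = '';
    for (1 .. 6) {
        $digits = $ALPHABET[$time % 62] . $digits;
        $time   = int($time / 62);
    }
    return $digits;
}

# Made-up IDs whose time fields encode 124, 62 and 72 seconds.
my @mids = ('000020-000001-00', '000010-00000z-G8', '00001A-000001-00');

my $cutoff = sketch_mid_time(100);                 # 00001c
my @older  = grep { $_ lt $cutoff } sort @mids;    # IDs from before t = 100

print "$_\n" for @older;   # 000010-00000z-G8, then 00001A-000001-00
```

The comparison relies on the built-in string operators comparing character codes, so it should not be run under a locale-aware collation.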

read_exim_mid(MID)
read_exim_mid36(MID)

This function extracts the information encoded in an Exim message ID. This is a slightly naughty thing to do: the ID should really only be used as a unique identifier. Nevertheless, the time encoded in an ID is sometimes useful.

The function returns a three-element list. The first two elements encode the time at which the ID was generated, as a (seconds, microseconds) pair giving the time since the epoch. This is the same time format as is returned by gettimeofday. The message ID does not encode the time with a resolution as fine as a microsecond; the returned microseconds value is rounded down appropriately. The third item in the result list is the encoded PID.

read_exim_mid(MID, HOST_NUMBER_P)
read_exim_mid36(MID, HOST_NUMBER_P)

The optional HOST_NUMBER_P argument is a truth value indicating whether the message ID was encoded using the variant algorithm that includes a host number in the ID. It is essential to decode the ID using the correct algorithm. The host number, if present, is returned as a fourth item in the result list.
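
The decoding described in this section and the previous one can be sketched as below, for the base 62 scheme only. The names sketch_read_base62 and sketch_read_mid are hypothetical, the digit alphabet (0-9, A-Z, a-z) is an assumption, and the error handling is illustrative rather than a description of how this module reports malformed input.

```perl
use strict;
use warnings;

# Assumed mapping from base 62 digit characters to their values.
my %VALUE;
@VALUE{ '0'..'9', 'A'..'Z', 'a'..'z' } = (0 .. 61);

# Hypothetical helper: value of a string of base 62 digits.
sub sketch_read_base62 {
    my ($digits) = @_;
    my $value = 0;
    $value = $value * 62 + $VALUE{$_} for split //, $digits;
    return $value;
}

# Hypothetical helper: (sec, usec, pid) or (sec, usec, pid, host_number).
sub sketch_read_mid {
    my ($mid, $host_number_p) = @_;
    my @fields = $mid =~
        /\A([0-9A-Za-z]{6})-([0-9A-Za-z]{6})-([0-9A-Za-z]{2})\z/
        or die "malformed message ID\n";
    my ($sec, $pid, $frac) = map { sketch_read_base62($_) } @fields;
    # Plain variant: the fraction counts units of 500 us.
    return ($sec, $frac * 500, $pid) unless $host_number_p;
    # Host variant: host number times 200, plus units of 5 ms.
    return ($sec, ($frac % 200) * 5000, $pid, int($frac / 200));
}

my @plain  = sketch_read_mid('000010-00000z-G8');
my @hosted = sketch_read_mid('000010-00000z-AU', 1);
print "@plain\n";    # 62 500000 61
print "@hosted\n";   # 62 250000 61 3
```

Decoding the second ID without the flag would give a different microseconds value and no host number, which is why the correct variant must be chosen.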

base62(NDIGITS, VALUE)
base36(NDIGITS, VALUE)

These perform base 62 and base 36 encoding respectively. VALUE and NDIGITS must both be non-negative native integers. VALUE is expressed in base 62 or base 36 respectively, and the least significant NDIGITS digits are returned as a string.
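
To illustrate what "the least significant NDIGITS digits" means in practice, here is a small sketch of fixed-width encoding. sketch_base is a hypothetical helper that takes the radix as an extra first argument, and the digit alphabet is an assumption; it is not this module's implementation.

```perl
use strict;
use warnings;

# Assumed digit alphabet; base 36 uses only the first 36 characters.
my @ALPHABET = ('0'..'9', 'A'..'Z', 'a'..'z');

# Hypothetical helper: exactly $ndigits digits of $value in base $radix.
sub sketch_base {
    my ($radix, $ndigits, $value) = @_;
    my $digits = '';
    for (1 .. $ndigits) {
        $digits = $ALPHABET[$value % $radix] . $digits;
        $value  = int($value / $radix);
    }
    return $digits;
}

print sketch_base(62, 3, 1000), "\n";           # 0G8 (padded with a leading zero)
print sketch_base(36, 3, 1000), "\n";           # 0RS
print sketch_base(62, 2, 62**3 + 1000), "\n";   # G8 (higher digits discarded)
```

Small values are padded with leading zeros, while values too large for the requested width lose their most significant digits without any error.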

read_base62(DIGITS)
read_base36(DIGITS)

These perform base 62 and base 36 decoding respectively. DIGITS must be a string of base 62 or base 36 digits respectively. The string is interpreted as a number in that base, and its value is returned as a native integer.
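
The decoding can be sketched with a single routine parameterised by radix. sketch_read_digits is a hypothetical name and the digit alphabet is an assumption. In particular, this sketch rejects lowercase letters in base 36, which may not match how read_base36 treats them.

```perl
use strict;
use warnings;

# Assumed digit alphabet; base 36 uses only the first 36 characters.
my @ALPHABET = ('0'..'9', 'A'..'Z', 'a'..'z');

# Hypothetical helper: value of $digits read as a base $radix number.
sub sketch_read_digits {
    my ($radix, $digits) = @_;
    my %value_of;
    @value_of{ @ALPHABET[0 .. $radix - 1] } = (0 .. $radix - 1);
    my $value = 0;
    for my $digit (split //, $digits) {
        die "invalid base $radix digit '$digit'\n"
            unless exists $value_of{$digit};
        $value = $value * $radix + $value_of{$digit};
    }
    return $value;
}

print sketch_read_digits(62, 'G8'), "\n";   # 1000
print sketch_read_digits(36, 'RS'), "\n";   # 1000
```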

BUGS

These functions can theoretically generate duplicate message IDs during a leap second. Exim suffers from the same problem.

SEE ALSO

Data::ID::Maildir, UUID, Win32::Guidgen, http://www.exim.org

AUTHOR

Andrew Main (Zefram) <zefram@fysh.org>

COPYRIGHT

Copyright (C) 2004, 2006, 2007, 2009, 2010, 2011, 2017 Andrew Main (Zefram) <zefram@fysh.org>

LICENSE

This module is free software; you can redistribute it and/or modify it under the same terms as Perl itself.