
# NAME

Data::Dimensions - Strongly type values with physical units

# SYNOPSIS

``````
use Data::Dimensions qw(extended &units);

my $energy = Data::Dimensions->new( {joule => 1} );
# or, more simply...
my $mass   = units( {kg => 1} );
my $c      = units( {m => 1, s => -1} );

$mass->set = 10; $c->set = 299_792_458;

# checks that units of mc^2 same as energy, use indirect syntax...
set $energy = $mass * $c**2;

# made a mistake on right, so dies with error
set $energy = $mass * $c**3;
``````

# DESCRIPTION

## Careful with that Equation, Eugene

In many applications type checking makes code more robust, because algorithmic (rather than syntax) errors can be found automatically. Most languages that implement a type system (e.g. C) only go as far as giving each variable or function a single type property (such as `int frobnicate(int x, float y)`), which can be a user-defined type (a C `typedef`). This system is useful but falls short of the typing needed in many applications. For instance, it cannot catch the following error (again, in C):

``````
PENCE_PER_GALLON unit_price;
VOLUME           volume;
PENCE            price;

price = volume / unit_price;
``````

Instead we want unit_price to have a type of pence per gallon, volume a type of gallons and price a type of pence. We also want these types to propagate through expressions so that the resulting type of `volume / unit_price` is

`` gallons / ( pence / gallons ) == gallons ** 2 / pence``

which is clearly not the same type as price. We can detect the mismatch and issue an appropriate error message.
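The propagation described above can be sketched in plain Perl by treating each type as a hash of `unit => exponent` pairs. This is only an illustration of the idea, not the module's implementation; the helper names `combine_units` and `same_units` are made up for this example.

```perl
use strict;
use warnings;

# Combine two unit hashes ({unit => exponent}); $sign is +1 for
# multiplication and -1 for division.
sub combine_units {
    my ($left, $right, $sign) = @_;
    my %result = %$left;
    for my $unit (keys %$right) {
        $result{$unit} = ($result{$unit} // 0) + $sign * $right->{$unit};
        delete $result{$unit} if $result{$unit} == 0;   # unit cancelled out
    }
    return \%result;
}

# Two unit hashes match when they have the same keys and exponents.
sub same_units {
    my ($x, $y) = @_;
    return 0 unless scalar(keys %$x) == scalar(keys %$y);
    for my $unit (keys %$x) {
        return 0 unless exists $y->{$unit} && $y->{$unit} == $x->{$unit};
    }
    return 1;
}

my $volume     = { gallon => 1 };
my $unit_price = { pence => 1, gallon => -1 };
my $price      = { pence => 1 };

my $wrong = combine_units($volume, $unit_price, -1);   # volume / unit_price
my $right = combine_units($volume, $unit_price, +1);   # volume * unit_price

print "volume / unit_price matches price? ",
      (same_units($wrong, $price) ? "yes" : "no"), "\n";
print "volume * unit_price matches price? ",
      (same_units($right, $price) ? "yes" : "no"), "\n";
```

Division subtracts exponents, so the faulty expression ends up with gallons squared per pence, which no longer compares equal to plain pence.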

Many scientific applications also require strong typing of this form. For instance, in the famous equation `E == M * C**2` the type (or units) of energy (Joule) is identical to the units of mass (kg) times the units of the speed of light (m/s) squared. This indicates that the equation is dimensionally consistent, and if we use it as part of a calculation in a program, we can use the units of the quantities to check that we have entered it correctly.

It is also important to note that in many cases two quantities will have different units but are used to measure the same underlying property of something. For instance, the metric meter and the Imperial foot both measure the length of an object. As an example, the volume of wood in a thin plank could be calculated given:

``````
$length in yards
$width  in feet
$depth  in inches
$volume in cubic feet
``````

We could calculate our volume by carefully converting all the measurements to have the same units (inches, say) but this introduces large amounts of code into our application which isn't crucial to the problem we are attempting to solve (and that's a bad thing, remember). Instead if our variables are all typed, we can get them to perform automatic conversion between different units, so that

`` $volume = $length * $width * $depth;``

is all we need to say.
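For contrast, here is a rough sketch of the hand-written conversion that typed values are meant to replace. The plank measurements and the `%feet_per` table are invented for the example; only the conversion factors themselves are standard.

```perl
use strict;
use warnings;

# Conversion factors to feet.
my %feet_per = ( yard => 3, foot => 1, inch => 1 / 12 );

# Hypothetical plank measurements, each in a different unit.
my $length_yd = 2;      # yards
my $width_ft  = 0.5;    # feet
my $depth_in  = 1.5;    # inches

# Every operand has to be converted by hand before multiplying.
my $volume_cuft = ($length_yd * $feet_per{yard})
                * ($width_ft  * $feet_per{foot})
                * ($depth_in  * $feet_per{inch});

printf "%.4f cubic feet\n", $volume_cuft;
```

Each new unit adds another table entry and another place to forget a conversion, which is the clutter the typed version avoids.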

## Typing to the Rescue

This module allows you to type your values with units. These values can then be used throughout your program and will automatically convert themselves sensibly between measurement systems and ensure that they are only used appropriately. A range of popular units is provided with this module, and the interface needed to add your own units is simple (and documented).

# Introducing Types to your Program

## Creating typed values

A typed value is created in the same way as any other object in Perl, using the `new` method of the Data::Dimensions class, or by importing the `units` subroutine when loading the module. The units of the value should be expressed as a hash reference giving `unit => exponent` pairs:

`` $distance = Data::Dimensions->new( {meter => 1 } );``

Optionally, an initial value (in the natural units of the variable) can be assigned as a second argument:

``````
$speed  = units( {miles => 1, hour => -1}, 70 );
$time   = units( {hour => 1}, 2 );
``````

## Assignment to a typed value

Typed values can then be used as you would any other variable, with the exception that assignment to a typed variable must be done through the variable's `set` method. (I find it slightly nicer to use the indirect object syntax shown below.)

`` set $distance = $speed * $time;``

(Due to a bug in the tie mechanism of Perl, it is not possible to allow the simpler:

`` $distance = $speed * $time;``

This will set \$distance, but will not check that a value with the correct units is being stored in it.)

You can also use the `set` method to give a value to a variable:

`` set $speed = 60; #  "rollers"``

This expects a value expressed in the natural units of the variable (in this case miles per hour). If a typed value is stored, the base units of the stored value must match those of the variable in which it is being stored. The natural units can be different; if so, any necessary scaling is performed automatically.

The following is valid (if a little contrived):

``````
$length = units( {'kilo-meter' => 1}, 12 );   # quoted: a bare kilo-meter would parse as a subtraction
$width  = units( {mile => 1}, 3 );
$area   = units( {acre => 1} );

set $area = $length * $width;
``````

When variables are output (converted into numbers, printed, compared with untyped values etc.), they are always treated as being in their natural units, so that:

`` print $area, "\n";``

will output \$area in acres.

## Mixing incompatible types

The major point of this module is to detect errors in expressions. This means, for instance, that both operands of a '+' operator must have the same basic units: it is OK to add distance to distance, but nonsense to add volume to speed. If incorrect units are present in an expression, the module will die with an appropriate error message.

For arithmetic and comparison operations, any untyped values are assumed to have the same type as the typed operand, and be expressed in its natural units, so

`` $length = $old_length + 12;``

will effectively upgrade 12 to be the same type as \$old_length. Similarly, saying:

`` if ($length == 15) ...``

will work as expected.
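To make the two rules concrete (mismatched units die, plain numbers are upgraded), here is a deliberately tiny sketch built on Perl's core `overload` pragma. The `Tiny::Quantity` class is hypothetical and stores its unit as a single string; it is not how Data::Dimensions is implemented, and it handles only `+`.

```perl
use strict;
use warnings;

package Tiny::Quantity;   # hypothetical class, for illustration only

use overload
    '+'  => \&add,
    '""' => sub { $_[0]{value} };   # print as the bare number

sub new {
    my ($class, $unit, $value) = @_;
    return bless { unit => $unit, value => $value }, $class;
}

sub add {
    my ($self, $other) = @_;
    # A plain number is treated as having the typed operand's unit.
    $other = Tiny::Quantity->new($self->{unit}, $other) unless ref $other;
    die "Cannot add $other->{unit} to $self->{unit}\n"
        if $other->{unit} ne $self->{unit};
    return Tiny::Quantity->new($self->{unit}, $self->{value} + $other->{value});
}

package main;

my $old_length = Tiny::Quantity->new(meter => 3);
my $length     = $old_length + 12;          # 12 is upgraded to meters
print "length: $length\n";

my $speed = Tiny::Quantity->new('meter/second' => 5);
my $ok = eval { my $bad = $old_length + $speed; 1 };
print "error: $@" unless $ok;
```

The real module compares base units rather than unit names, so quantities in compatible units (feet and meters, say) can still be added.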

# Dimensions and Measurement systems provided

## Prefixes

All units can carry standard prefixes to indicate appropriate powers of ten; for example, kilometers are specified with "k-m" or "kilo-m". The following prefixes are available:

``````
semi- demi-    0.5
Y- yotta-      1e24
Z- zetta-      1e21
E- exa-        1e18
P- peta-       1e15
T- tera-       1e12
G- giga-       1e9
M- mega-       1e6
k- kilo-       1e3
h- hecto-      1e2
da- deka-      1e1
d- deci-       1e-1
c- centi-      1e-2
m- milli-      1e-3
u- micro-      1e-6
n- nano-       1e-9
p- pico-       1e-12
f- femto-      1e-15
a- atto-       1e-18
z- zopto-      1e-21
y- yocto-      1e-24
``````

Prefixes are stripped off the unit before any user-defined handlers are run.
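As a rough picture of what stripping a prefix involves, the sketch below splits a `prefix-unit` name into a power of ten and the remaining unit. The `%prefix_scale` table holds only a handful of entries from the list above, and `split_prefix` is a made-up helper; the module's own parsing may well differ.

```perl
use strict;
use warnings;

# A few of the prefixes listed above (short and long forms).
my %prefix_scale = (
    'k' => 1e3,  'kilo'  => 1e3,
    'c' => 1e-2, 'centi' => 1e-2,
    'm' => 1e-3, 'milli' => 1e-3,
    'M' => 1e6,  'mega'  => 1e6,
);

# Split "prefix-unit" into (scale, unit). Names with no prefix, or with
# a prefix this table does not know, are returned unchanged with scale 1.
sub split_prefix {
    my ($name) = @_;
    if ($name =~ /^([^-]+)-(.+)$/ && exists $prefix_scale{$1}) {
        return ($prefix_scale{$1}, $2);
    }
    return (1, $name);
}

for my $name (qw(k-m kilo-m m-s chicken)) {
    my ($scale, $unit) = split_prefix($name);
    print "$name => $scale x $unit\n";
}
```

Note that the lookup is case sensitive, which is what keeps `m-` (milli) and `M-` (mega) apart.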

## SI Units

SI units are generally used as the base units for almost all measurements (with the exception of monetary units, which lack a common base due to exchange rate fluctuations). The following units are those the module most likes to see:

``````
m   - meter, length
kg  - kilogram, mass
s   - second, time
A   - Ampere, electrical current
K   - Kelvin, temperature
mol - mole, amount of substance
cd  - candela, luminous intensity

sr  - steradian, measure of solid angle
``````

In addition to these, the following are defined and map appropriately to their actual units. Apart from one- or two-letter units, all should be specified in lower case:

``````
meter, kilogram, sec, second, amp, ampere, kelvin, candela, mole,
coulomb, seimens, farad, weber, henry, tesla, lumen, becquerel, gray,
Hz, N, Pa, J, W, coul, V, ohm, S, F, Wb, H, T, lm, Bq, Gy.
``````

## Other Units

The following units are also provided. If you wish to use these, you must specify the `extended` option when loading the module.

``````
lb nmile centrigrade foot electronvolt baud brpint arcsec ft arcmin
deg hr liter inch yr week gram cc day min feet minute year brquart
hour brgallon mile fermi tonne micron lightyear siderealyear cal gm
gallon acre erg revolution ounce degree parsec in point block celsius
barn gal ml byte turn mho quart pint amu arcdeg pound yard yd oz
angstrom
``````

Any units given to the module which it does not understand are simply left in place. If all you're doing is measuring chickens, then just use a chicken unit throughout your code.

## Base units, natural units and scaling

To allow for conversion between different measurement systems, it is necessary to choose one which is better than all the others, one which forms the base on which all other measurement systems rest and in whose units all other units can be given. In scientific systems this will be SI, where the Joule can be expressed as

`` Joule == (kg=>1, m=>2, s=>-2)``

and the eV (electron volt, used in particle physics) can be expressed as:

`` 1 electron Volt == 1.6E-19 Joule == 1.6E-19 * (kg=>1, m=>2, s=>-2)``

Here we say that kg, m and s are the base units of Joules and electron volts, and that the scaling between Joule and its base form is 1, and that between eV and its base form is 1.6E-19. More generally,

`` 1 natural unit == scaling factor * base unit``

Any unit system you want to introduce to your program must be able to take a set of natural units, convert these into suitable base units (which may or may not be SI depending on your application) and calculate the scaling factor which must be used when converting values from the base unit to the natural unit.
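The relation `1 natural unit == scaling factor * base unit` amounts to a multiplication in one direction and a division in the other. The sketch below shows that arithmetic on its own, using the 1.6E-19 figure from the text; `%scale_to_base`, `to_base` and `from_base` are names invented for this example and are not part of the module.

```perl
use strict;
use warnings;

# How many base units (Joules here) make up one natural unit.
my %scale_to_base = (
    joule => 1,          # already in base form
    eV    => 1.6e-19,    # figure used in the text above
);

# natural -> base: multiply by the scaling factor
sub to_base   { my ($value, $unit) = @_; return $value * $scale_to_base{$unit} }

# base -> natural: divide by the scaling factor
sub from_base { my ($value, $unit) = @_; return $value / $scale_to_base{$unit} }

my $in_joules = to_base(5, 'eV');            # 5 eV expressed in Joules
my $in_ev     = from_base(3.2e-19, 'eV');    # 3.2e-19 J expressed in eV

printf "5 eV = %.2e J\n", $in_joules;
printf "3.2e-19 J = %.2f eV\n", $in_ev;
```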

## How to do this

You simply need to write a subroutine which takes as its arguments a reference to a hash of natural units and the current scaling factor. It must return a new hash reference with units defined in appropriate basic units, and a scaling factor. Any units your routine does not understand should be left alone as it is possible to chain these subroutines. Your subroutine will always be called before those provided by the module, so it is ok to return nearly basic units like 'Joule'.

You register the routine with `push_handler` after loading Data::Dimensions. The following example allows electron Volts to be used:

``````
use Data::Dimensions qw(&units extended);
Data::Dimensions->push_handler(\&myunits);

# Convert electron volts to Joules; leave every other unit untouched.
sub myunits {
    my ($natural, $scale) = @_;
    my %temp;
    foreach (keys %$natural) {
        if (/^(ev|electronvolts?)$/i) {
            # one eV is 1.6E-19 J, raised to the unit's exponent
            $scale       *= 1.6E-19**($natural->{$_});
            $temp{joule} += $natural->{$_};
        }
        else {
            $temp{$_}    += $natural->{$_};
        }
    }
    return (\%temp, $scale);
}
``````
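A handler like this can be exercised on its own before it is registered. The sketch below repeats the `myunits` routine so that it runs without Data::Dimensions installed, then calls it by hand with a unit it understands and one it does not; the input hash is made up for the example.

```perl
use strict;
use warnings;

# Same handler as above, repeated so this snippet runs on its own.
sub myunits {
    my ($natural, $scale) = @_;
    my %temp;
    foreach (keys %$natural) {
        if (/^(ev|electronvolts?)$/i) {
            $scale       *= 1.6E-19 ** $natural->{$_};
            $temp{joule} += $natural->{$_};
        }
        else {
            $temp{$_}    += $natural->{$_};
        }
    }
    return (\%temp, $scale);
}

# Electron volts squared, plus a unit the handler does not know.
my ($units, $scale) = myunits({ eV => 2, chicken => 1 }, 1);

print join(', ', map { "$_ => $units->{$_}" } sort keys %$units), "\n";
print "scale = $scale\n";
```

The unknown `chicken` unit comes back unchanged, which is what allows handlers to be chained, and the scale is the eV factor raised to the unit's exponent.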

You can also add units to the %Data::Dimensions::Map::units hash and load the module with the "extended" option. This hash contains entries with the following structure:

``    $units{unit} = [scale, { basic units }  ];``

e.g. `` $units{inch} = [2.54 / 100, {m=>1} ];``
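To show how an entry of this shape could be applied, the sketch below expands a `{ unit => exponent }` hash against a small local table laid out the same way. It uses its own lexical `%units` and a made-up `expand` routine rather than touching %Data::Dimensions::Map::units, so treat it as a guess at the mechanics, not a description of the module's internals.

```perl
use strict;
use warnings;

# Local table in the [scale, { basic units }] layout shown above.
my %units = (
    inch  => [ 2.54 / 100, { m => 1 } ],
    joule => [ 1,          { kg => 1, m => 2, s => -2 } ],
);

# Expand a { unit => exponent } hash into (base units, scale).
sub expand {
    my ($natural) = @_;
    my %base;
    my $scale = 1;
    for my $unit (keys %$natural) {
        my $power = $natural->{$unit};
        my $entry = $units{$unit};
        if (!$entry) {                    # unknown units pass through
            $base{$unit} += $power;
            next;
        }
        my ($factor, $basic) = @$entry;
        $scale *= $factor ** $power;      # scale is raised to the exponent
        $base{$_} += $power * $basic->{$_} for keys %$basic;
    }
    return (\%base, $scale);
}

my ($base, $scale) = expand({ inch => 2 });   # square inches
printf "inch^2 = %.6g x m^%d\n", $scale, $base->{m};
```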

# Debugging Hooks

If you're getting confused, you can call `$variable->_dump` to get a pretty output of the underlying structure.

# Future plans

It would be nice to get this working with Attributes and to rid the module of the `set` evil. More units would be helpful as would documentation that isn't confusing. Also add constants with units, trivial but boring.

# BUGS

The `$foo->set` is annoying, but needed as there seems to be a bug in Perl's overload and tie mechanisms.

If you discover any bugs in this module, or have features you would like added, please report them via the CPAN Request Tracker at rt.cpan.org. Any other comments are welcome and should be sent directly to the author.

# AUTHOR

Alex Gough (alex@earth.li) -- Do get in touch, it will make me smile.

This module is copyright (c) Alex Gough 2001-2002. This module is free software, you may use and redistribute it under the same terms as Perl itself.