NAME

Array::DeepUtils - utilities for the manipulation of nested arrays

VERSION

This document refers to version 0.1 of Array::DeepUtils

SYNOPSIS

    use Array::DeepUtils qw/:all/;

    binary(
        [1,2,3,4,5,6,7,8],
        [[1,1],[2,2],[3,3],[4,4]],
        sub { $_[0] + $_[1] }
    );

yields:

    [
      [    2,     3  ],
      [    5,     6  ],
      [    8,     9  ],
      [   11,    12  ],
    ]

A more complex example:

  my $x = [1..9];

  my $y = reshape($x, [3,3,3,3], $x);

$y is now:

  [
   [
    [[1,2,3],[4,5,6],[7,8,9]],
    [[1,2,3],[4,5,6],[7,8,9]],
    [[1,2,3],[4,5,6],[7,8,9]],
   ],
   [
    [[1,2,3],[4,5,6],[7,8,9]],
    [[1,2,3],[4,5,6],[7,8,9]],
    [[1,2,3],[4,5,6],[7,8,9]],
   ],
   [
    [[1,2,3],[4,5,6],[7,8,9]],
    [[1,2,3],[4,5,6],[7,8,9]],
    [[1,2,3],[4,5,6],[7,8,9]],
   ]
  ];


  my $z = dcopy($y, [[1,1,1,1],[2,2,2,2]]);

$z is now:

  [
   [
    [[5,6],[8,9]],
    [[5,6],[8,9]],
   ],
   [
    [[5,6],[8,9]],
    [[5,6],[8,9]],
   ]
  ];

  my $c = reshape([], [2,2], collapse($z));

resulting in $c being:

  [[5,6],[8,9]]

DESCRIPTION

This module is a collection of subroutines for the manipulation of deeply nested arrays. It provides routines for iterating along coordinates and for setting, retrieving and deleting values. The subroutines binary() and unary() apply arbitrary operators, passed as code references, to deeply nested arrays. shape() and reshape() determine and change the dimensions.

By default nothing is exported. The subroutines can be imported all at once via the ':all' tag.

Subroutine short description

"binary()" - apply a binary operator between two nested arrays

"collapse()" - flatten a nested array to a one dimensional vector

"dcopy()" - extract part of a nested array between two vectors

"idx()" - build an index vector for values of another vector

"purge()" - remove elements by value from a nested array

"remove()" - remove elements by index

"reshape()" - transform nested array by dimension vector

"rotate()" - rotate a data structure along its axes

"scatter()" - build a new data structure from data and an index vector

"shape()" - get nested array dimension vector

"subscript()" - extract nested array values by index vector

"transpose()" - transpose a nested array

"unary()" - apply a unary operator to all values of a nested array

"value_by_path()" - extract nested array values by coordinate vector

"vector_iterator()" - create a subroutine for iterating between two coordinates

SUBROUTINES

binary()

binary($aref1, $aref2, $subref, $neutral_element [, $object, $fill_aref])

Recursively apply a binary operator represented by a subroutine reference to all elements of two nested data structures given in $aref1 and $aref2 and set the resulting values in $aref2. $aref2 will also be returned.

If these structures differ in shape they will be reshaped according to the larger structure. The value of $neutral_element will be used if one of the operands is undefined or does not exist. $neutral_element can also be a subroutine reference; it will then be called on value retrieval with $aref1 or $aref2, respectively, as its only parameter. To be able to use methods as subroutines, $object will be passed to the subroutine as first parameter when specified. Since binary() calls reshape(), a given $fill_aref will be passed as the third parameter to reshape().

A simple example, after:

 my $v1   = [1,2,3];
 my $v2   = [9,8,7];
 my $func = sub { $_[0] * $_[1] };
 binary($v1, $v2, $func);

$v2 will have a value of

 [9, 16, 21]

Making it a bit more complicated:

 my $v1   = [1,2,3,4,5,6];
 my $v2   = [9,8,7];
 my $func = sub { $_[0] * $_[1] };
 binary($v1, $v2, $func);

results in:

 [9,16,21,36,40,42]

because missing values will be filled with the flattened structure repeated as often as it is needed, so the above is exactly the same as:

 my $v1   = [1,2,3,4,5,6];
 my $v2   = [9,8,7,9,8,7];
 my $func = sub { $_[0] * $_[1] };
 binary($v1, $v2, $func);

The fill parameter specifies the values to be used for filling. These values will also be repeated when necessary.

 my $v1   = [1,2,3,4,5,6];
 my $v2   = [9,8,7];
 my $fill = [1,2];
 my $func = sub { $_[0] * $_[1] };
 binary($v1, $v2, $func, 1, undef, $fill);

results in:

 [9,16,21,4,10,6];

because $v2 will have been reshaped to [9,8,7,1,2,1] before the multiplication.
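The cyclic filling described above can be pictured for flat vectors with a small sketch. It does not use Array::DeepUtils itself; extend_cyclic() is a hypothetical helper written only for this illustration, and it assumes a non-empty fill source.

```perl
use strict;
use warnings;

# Hypothetical helper (not part of Array::DeepUtils): extend a flat vector
# to $len elements. Missing values are drawn cyclically from $fill, or from
# the vector's own values when no fill is given.
sub extend_cyclic {
    my ($vec, $len, $fill) = @_;
    my @src = $fill ? @$fill : @$vec;   # assumed to be non-empty
    my @out = @$vec;
    my $i   = 0;
    push @out, $src[ $i++ % @src ] while @out < $len;
    return \@out;
}

my $v1 = [1, 2, 3, 4, 5, 6];
my $v2 = extend_cyclic([9, 8, 7], scalar @$v1, [1, 2]);
print "@$v2\n";        # 9 8 7 1 2 1

# Element-wise product, as the multiplication callback would compute it.
my @product = map { $v1->[$_] * $v2->[$_] } 0 .. $#$v1;
print "@product\n";    # 9 16 21 4 10 6
```

Without a fill vector the same helper repeats the short vector itself, which corresponds to the [9,8,7,9,8,7] case shown earlier.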

This works for vectors of arbitrary depth, so that:

 my $v1   = [[1,2,3], [4,5,6], [7,8,9]];
 my $v2   = [[11,12], [13,14]];
 my $fill = [1, -1];
 my $func = sub { $_[0] * $_[1] };
 binary($v1, $v2, $func, 1, undef, $fill);

yields:

 [[11,24,3], [52,70,-6], [7,-8,9]]

collapse()

collapse($aref1)

Collapse the referenced array of arrays of arbitrary depth, i.e. flatten it to a simple array and return a reference to it.

Example:

 collapse([[1,2,3],4,[5,[6,7,8,[9,0]]]]);

will return:

 [1,2,3,4,5,6,7,8,9,0]
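For readers who want to see what flattening amounts to, here is a minimal recursive sketch. flatten_sketch() is a hypothetical stand-in written for this illustration; it is not the module's code and makes no claim about how collapse() is implemented.

```perl
use strict;
use warnings;

# Hypothetical stand-in for illustration only.
sub flatten_sketch {
    my ($data) = @_;
    # Recurse into array references, pass scalars through unchanged.
    return [ map { ref($_) eq 'ARRAY' ? @{ flatten_sketch($_) } : $_ } @$data ];
}

my $flat = flatten_sketch([[1, 2, 3], 4, [5, [6, 7, 8, [9, 0]]]]);
print "@$flat\n";    # 1 2 3 4 5 6 7 8 9 0
```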

dcopy()

dcopy($aref, $coord_aref)

Extract a part of a deeply nested array between two vectors given in the array referenced by $coord_aref. This is done via an iterator generated with vector_iterator() running from the first to the second coordinate given.

Example:

 dcopy([[1,2,3], [4,5,6], [7,8,9]], [[1,0], [2,1]]);

will return

  [ [4,5], [7,8] ]

This will work in either direction, so:

 dcopy([[1,2,3], [4,5,6], [7,8,9]], [[2,1], [1,0]]);

will give:

  [ [8,7], [5,4] ]

as expected.

idx()

idx($aref1, $aref2)

Return an index vector that contains, for each element of the first argument, the coordinates at which that element is found in the nested array given as the second argument.

Example:

 idx([[1,3],[4,5]], [[1,2,3], [4,5,6], [7,8,9]]);

will return:

 [[[0,0],[0,2]],[[1,0],[1,1]]]
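The lookup can be sketched in plain Perl as follows. Both helpers are hypothetical and exist only for this illustration. The sketch assumes that the first occurrence of a value wins and that values which are not found map to undef; neither behaviour is confirmed for idx() itself.

```perl
use strict;
use warnings;

# Walk a nested array and remember the coordinate of each scalar's
# first occurrence (assumption: first occurrence wins).
sub coords_by_value {
    my ($data, $path, $seen) = @_;
    for my $i (0 .. $#$data) {
        my $item = $data->[$i];
        if (ref($item) eq 'ARRAY') {
            coords_by_value($item, [@$path, $i], $seen);
        }
        else {
            $seen->{$item} //= [@$path, $i];
        }
    }
    return $seen;
}

# Replace every scalar in $values by its coordinate from the lookup table.
sub idx_sketch {
    my ($values, $lookup) = @_;
    return [ map { ref($_) eq 'ARRAY' ? idx_sketch($_, $lookup) : $lookup->{$_} } @$values ];
}

my $lookup = coords_by_value([[1, 2, 3], [4, 5, 6], [7, 8, 9]], [], {});
my $index  = idx_sketch([[1, 3], [4, 5]], $lookup);
# $index is now [[[0,0],[0,2]],[[1,0],[1,1]]]
```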

purge()

purge($aref, $what)

Remove all values from the array referenced by $aref that equal $what in a string comparison.

Example:

 $v = [1,0,1,0,1,0,1,0];
 purge($v, '0');

will have $v reduced to:

 [1,1,1,1]
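Because the comparison is a string comparison, '0.0' is not removed when purging '0'. The following sketch shows that distinction with a hypothetical helper, purge_sketch(), which is not the module's code and assumes all elements are defined.

```perl
use strict;
use warnings;

# Hypothetical helper for illustration: remove, in place, every scalar
# that is string-equal to $what, descending into nested arrays.
sub purge_sketch {
    my ($data, $what) = @_;
    purge_sketch($_, $what) for grep { ref($_) eq 'ARRAY' } @$data;
    @$data = grep { ref($_) || $_ ne $what } @$data;
    return $data;
}

my $v = [1, 0, 1, 0, 1, 0, 1, 0];
purge_sketch($v, '0');
print "@$v\n";    # 1 1 1 1

my $w = [1, '0.0', 0];
purge_sketch($w, '0');
print "@$w\n";    # 1 0.0
```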

remove()

remove($aref, $index|$coordinate_aref)

Remove all values with indices or coordinates given by $index or by the array referenced by $coordinate_aref from an array referenced by $aref.

Example:

 my $v = [1,2,3,4,5,6,7,8,9,0];
 remove($v, [1,2,3]);

will have $v reduced to:

 [1,5,6,7,8,9,0]

and:

 my $aref = [[1,2,3],[4,5,6],[7,8,9]];

 remove($aref, [[0,1], [1,2], 2]);

will leave:

 [[1,3],[4,5]]

in $aref.
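For the flat case, removal by index can be sketched with splice. remove_indices_sketch() is a hypothetical helper for this illustration only; it handles plain indices, not coordinate vectors, and assumes the indices are distinct.

```perl
use strict;
use warnings;

# Hypothetical helper for illustration: remove the given indices from a
# flat array in place.
sub remove_indices_sketch {
    my ($data, $indices) = @_;
    # Delete from the highest index down, so that earlier splices do not
    # shift the positions of elements still to be removed.
    splice @$data, $_, 1 for sort { $b <=> $a } @$indices;
    return $data;
}

my $v = [1, 2, 3, 4, 5, 6, 7, 8, 9, 0];
remove_indices_sketch($v, [1, 2, 3]);
print "@$v\n";    # 1 5 6 7 8 9 0
```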

reshape()

reshape($aref, $dims_aref [, $fill_aref])

Create an array with the dimension vector given in $dims_aref and take the values from $aref provided there is a value at the given position. Additional values will be taken from the array referenced by $fill_aref or - if it is not provided - from a flattened (call to collapse()) version of the original array referenced by $aref. If the fill source is exhausted, reshape will start from index 0 again. This will be repeated until the destination array is filled.

Example:

 reshape([[1,2,3]], [3, 3]);

will return:

 [ [1,2,3], [1,2,3], [1,2,3] ]

and:

 reshape([[1,2,3]], [3, 3], ['x']);

will return:

 [ [1,2,3], ['x','x','x'], ['x','x','x'] ]
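The fill rule described above can be sketched in plain Perl. The helpers below are hypothetical and written only for this illustration: a position keeps its value when the source has one at the same coordinate, otherwise the next value of the cyclic fill source is used. The sketch assumes a non-empty fill source and is not the module's implementation.

```perl
use strict;
use warnings;

# Flatten nested arrays into a list (used as the default fill source).
sub flat { map { ref($_) eq 'ARRAY' ? flat(@$_) : $_ } @_ }

# Look up $data at coordinate @path; returns (found?, value).
sub fetch {
    my ($data, @path) = @_;
    for my $i (@path) {
        return (0) unless ref($data) eq 'ARRAY' && $i <= $#$data;
        $data = $data->[$i];
    }
    return (1, $data);
}

# Hypothetical sketch; $state and @path are used only by the recursion.
sub reshape_sketch {
    my ($src, $dims, $fill, $state, @path) = @_;
    $state ||= { values => [ $fill ? @$fill : flat(@$src) ], next => 0 };
    my $depth = @path;
    my @level;
    for my $i (0 .. $dims->[$depth] - 1) {
        if ($depth < $#$dims) {
            push @level, reshape_sketch($src, $dims, $fill, $state, @path, $i);
        }
        else {
            my ($found, $val) = fetch($src, @path, $i);
            my $values = $state->{values};
            push @level, $found ? $val : $values->[ $state->{next}++ % @$values ];
        }
    }
    return \@level;
}

my $r = reshape_sketch([[1, 2, 3]], [3, 3], ['x']);
print join(' | ', map { "@$_" } @$r), "\n";    # 1 2 3 | x x x | x x x
```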

rotate()

rotate($aref1, $aref2 [, $fill_aref])

Rotate a data structure along its axes. It is possible to perform more than one rotation at once, so rotating a two dimensional matrix along its x- and y-axes by +1 and -1 positions is no problem.

Example:

 rotate([[1, 2, 3], [4, 5, 6], [7, 8, 9]], [1, -1]);

will return:

 [[8,9,7],[2,3,1],[5,6,4]]

Using the optional third parameter it is possible to fill previously empty array elements with a given value via "reshape()".

scatter()

scatter($aref, $struct)

This function is the inverse of subscript(). While subscript() selects values from a nested data structure, controlled by an index vector, scatter() distributes elements into a new data structure, controlled by an index vector.

Example:

 scatter([1, 2, 3, 4, 5, 6, 7], [[0,0], [0,1], [1,0], [1,1]]);

will return:

 [[1, 2], [3, 4]]
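A minimal sketch of this distribution step follows. scatter_sketch() is a hypothetical helper for illustration; it assumes that surplus data values are simply ignored, which matches the example above but is otherwise unconfirmed.

```perl
use strict;
use warnings;

# Hypothetical helper for illustration: place the n-th data value at the
# n-th coordinate of a new nested array.
sub scatter_sketch {
    my ($values, $coords) = @_;
    my $out = [];
    for my $n (0 .. $#$coords) {
        last if $n > $#$values;                 # assumption: stop when data runs out
        my @path = @{ $coords->[$n] };
        my $last = pop @path;
        my $node = $out;
        $node = $node->[$_] //= [] for @path;   # create intermediate levels on demand
        $node->[$last] = $values->[$n];
    }
    return $out;
}

my $s = scatter_sketch([1, 2, 3, 4, 5, 6, 7], [[0, 0], [0, 1], [1, 0], [1, 1]]);
print join(' | ', map { "@$_" } @$s), "\n";    # 1 2 | 3 4
```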

shape()

shape($aref)

Determine the dimensions of an array and return them as a vector (an array reference).

Example:

 shape([[1,2,3], [4,5,6], [7,8,9]]);

will return:

 [3,3]

and:

 shape([[1,2,3],4,[5,[6,7,8,[9,0]]]]);

will return:

 [3,3,4,2]
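The second result is consistent with reading each dimension as the length of the longest array found at that nesting depth. The sketch below computes exactly that; shape_sketch() is a hypothetical helper, and the "longest array per depth" reading is an assumption drawn from the examples, not from the module's source.

```perl
use strict;
use warnings;

# Hypothetical helper for illustration; $depth and $dims are used only
# by the recursion.
sub shape_sketch {
    my ($data, $depth, $dims) = @_;
    $depth //= 0;
    $dims  //= [];
    # Record the longest array seen at this nesting depth.
    $dims->[$depth] = @$data if @$data > ($dims->[$depth] // 0);
    shape_sketch($_, $depth + 1, $dims) for grep { ref($_) eq 'ARRAY' } @$data;
    return $dims;
}

print "@{ shape_sketch([[1, 2, 3], [4, 5, 6], [7, 8, 9]]) }\n";        # 3 3
print "@{ shape_sketch([[1, 2, 3], 4, [5, [6, 7, 8, [9, 0]]]]) }\n";   # 3 3 4 2
```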

A combination of shape() and reshape() will effectively turn an "irregular" array into a regular one.

For example:

 $aref = [[1,2,3],4,[5,6],[7,8,9]];

 reshape($aref, shape($aref), [0]);

will return:

 [[1,2,3],[0,0,0],[5,6,0],[7,8,9]]

subscript()

subscript($aref, $index|$coord_aref)

Retrieve and return values of a deeply nested array for a single index, a list of indices, or a list of coordinate vectors.

Example:

 my $aref = [[1,2,3],[4,5,6],[7,8,9]];

 subscript($aref, 1);

returns:

 [[4,5,6]]

whereas:

 subscript($aref, [[0,1], [1,2], 2]);

returns:

 [2,6,[7,8,9]]

transpose()

transpose($aref1, $control [, $fill_aref])

Transpose a nested data structure. In the easiest two-dimensional case this is the traditional transposition operation.

Example:

 transpose([[1,2,3], [4,5,6], [7,8,9]], 1);

will return:

 [[1,4,7],[2,5,8],[3,6,9]]
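The two-dimensional case can be sketched by swapping row and column indices. transpose2d_sketch() is a hypothetical helper for this illustration only; it does not model the $control parameter or higher dimensions.

```perl
use strict;
use warnings;

# Hypothetical helper for illustration: classic matrix transposition.
sub transpose2d_sketch {
    my ($rows) = @_;
    my @out;
    for my $r (0 .. $#$rows) {
        for my $c (0 .. $#{ $rows->[$r] }) {
            $out[$c][$r] = $rows->[$r][$c];    # swap row and column index
        }
    }
    return \@out;
}

my $t = transpose2d_sketch([[1, 2, 3], [4, 5, 6], [7, 8, 9]]);
print join(' | ', map { "@$_" } @$t), "\n";    # 1 4 7 | 2 5 8 | 3 6 9
```

With rows of unequal length this sketch leaves undefined holes in the result.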

Using the optional third parameter, it is possible to fill previously empty array elements with a given value via "reshape()".

unary()

unary($aref1, $subref, $neutral_element [, $object])

Recursively apply a unary operator represented by a subroutine reference to all elements of a nested data structure given in $aref1 and set the resulting values in the referenced array itself. The reference will also be returned.

The value of $neutral_element will be used if the original is undefined or does not exist. To be able to use methods as subroutines $object will be passed to the subroutine as first parameter when specified.

A simple example, after:

 my $v    = [1,0,2,0,3,[1,0,3]];
 my $func = sub { ! $_[0] + 0 };

 unary($v, $func);

$v will contain:

 [0,1,0,1,0,[0,1,0]]

value_by_path()

value_by_path($aref, $coordinate [, $value [, $force]])

Get or set a value in a deeply nested array by a coordinate vector.

Example:

 my $vec = [[1,2,3], [4,5,6], [7,8,9]];

 value_by_path($vec, [1,1], 99);

will give:

 [[1,2,3], [4,99,6], [7,8,9]];

in $vec. This is not spectacular since one could easily write:

 $vec->[1][1] = 99;

but value_by_path() will be needed if the coordinate vector is created dynamically and can be of arbitrary length. If you explicitly want to set an undefined value, you have to set $force to a true value.
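Walking a coordinate vector of arbitrary length can be sketched as a simple loop. path_value_sketch() is a hypothetical helper for this illustration; unlike value_by_path() it has no $force parameter and sets a value whenever a third argument is passed.

```perl
use strict;
use warnings;

# Hypothetical helper for illustration: get or set the element addressed
# by a coordinate vector of any length.
sub path_value_sketch {
    my ($data, $path, $value) = @_;
    my @steps = @$path;
    my $last  = pop @steps;
    $data = $data->[$_] for @steps;        # walk down to the innermost array
    $data->[$last] = $value if @_ > 2;     # set only when a value was passed
    return $data->[$last];
}

my $vec = [[1, 2, 3], [4, 5, 6], [7, 8, 9]];
path_value_sketch($vec, [1, 1], 99);
print "@{ $vec->[1] }\n";                         # 4 99 6
print path_value_sketch($vec, [2, 0]), "\n";      # 7
```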

vector_iterator()

vector_iterator($from_aref, $to_aref)

This routine returns a subroutine reference to an iterator which is used to generate successive coordinate vectors starting with the coordinates in $from_aref to those in $to_aref.

The resulting subroutine will return a pair of coordinate vectors on each successive call, or an empty list once the last coordinate has been returned. The first coordinate returned is related to the given coordinate pair, the second one to a corresponding zero based array.

Example:

 my $aref = [[1,2,3], [4,5,6], [7,8,9]];

 my $iterator = vector_iterator([0,1], [1,2]);

 while ( my($svec, $dvec) = $iterator->() ) {
   my $val = value_by_path($aref, $svec);
   print "[$svec->[0] $svec->[1]] [$dvec->[0] $dvec->[1]] -> $val\n";
 }

will print:

 [0 1] [0 0] -> 2
 [0 2] [0 1] -> 3
 [1 1] [1 0] -> 5
 [1 2] [1 1] -> 6
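An iterator of this kind can be sketched as a closure that advances like an odometer. iterator_sketch() is a hypothetical helper written for this illustration and is not the module's code. It assumes both coordinates have the same length and steps downwards on any axis where the target is smaller than the start.

```perl
use strict;
use warnings;

# Hypothetical helper for illustration: returns a closure yielding
# (source coordinate, zero-based destination coordinate) pairs.
sub iterator_sketch {
    my ($from, $to) = @_;
    my @step = map { $to->[$_] >= $from->[$_] ? 1 : -1 } 0 .. $#$from;
    my @span = map { abs($to->[$_] - $from->[$_]) } 0 .. $#$from;
    my @pos  = (0) x @$from;    # zero-based destination coordinate
    my $done = 0;

    return sub {
        return if $done;
        my @src    = map { $from->[$_] + $step[$_] * $pos[$_] } 0 .. $#pos;
        my @result = (\@src, [@pos]);
        # Advance like an odometer: rightmost axis first, carry to the left.
        my $axis = $#pos;
        $pos[$axis--] = 0 while $axis >= 0 && $pos[$axis] == $span[$axis];
        if ($axis < 0) { $done = 1 } else { $pos[$axis]++ }
        return @result;
    };
}

my $next = iterator_sketch([0, 1], [1, 2]);
while (my ($svec, $dvec) = $next->()) {
    print "[@$svec] [@$dvec]\n";
}
# [0 1] [0 0]
# [0 2] [0 1]
# [1 1] [1 0]
# [1 2] [1 1]
```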

SEE ALSO

Array::DeepUtils was developed during the implementation of lang5, a stack-based array language. The source will be maintained in the source repository of lang5.

Bug Reports and Feature Requests

AUTHOR

Thomas Kratz <tomk@cpan.org>

Bernd Ulmann <ulmann@vaxman.de>

COPYRIGHT

Copyright (C) 2011 by Thomas Kratz, Bernd Ulmann

This library is free software; you can redistribute it and/or modify it under the same terms as Perl itself, either Perl version 5.8.8 or, at your option, any later version of Perl 5 you may have available.