NAME

Hash::MostUtils - Yet another collection of tools for operating pairwise on lists.

DESCRIPTION

This module provides a number of functions for processing hashes as lists of key, value pairs.

SYNOPSIS

  my @found_and_transformed =
      hashmap { uc($b) => 100 + $a }
      hashgrep { $a < 100 && $b =~ /[aeiou]/i } (
          1 => 'cwm',
          2 => 'apple',
          100 => 'cherimoya',
      );

  my @keys = lkeys @found_and_transformed;
  my @vals = lvalues @found_and_transformed;
  foreach my $key (@keys) {
      my $val = shift @vals;
      print "$key => $val\n";
  }

  while (my ($key, $val) = leach @found_and_transformed) {
      print "$key => $val\n";
  }

  my $serialized = join ',', hashsort { $a->{key} cmp $b->{key} } %hash;

EXPORTS

By default, none. On request, any of the following:

FUNCTIONS TO MAKE ARRAYS ACT LIKE HASHES

lkeys LIST

Return the "keys" of LIST. Perl's keys() keyword only operates on hashes; lkeys() offers an approximation of the same functionality for lists.

    my @evens = lkeys 1..10;                      # items at even indices: (1, 3, 5, 7, 9)

    my @keys  =
        lkeys                                     # give me back those keys (i.e. the letters)
        hashgrep { $b > 100 }                     # find key/value pairs where the value is > 100
        map { $_ => int(rand(1000)) } 'a'..'z';   # turn 'a'..'z' into key/value pairs with random values

The "keys" of a list are the items at even indices (0, 2, 4, ...). Note that for an "empty slot" in a sparse array, the key will be undef.

lvalues LIST

Return the "values" of LIST. Perl's values() keyword only operates on hashes; lvalues() offers an approximation of the same functionality for lists.

    my @odds = lvalues 1..10;                    # items at odd indices: (2, 4, 6, 8, 10)

    my @values =
        lvalues                                  # give me back those values (i.e. the letters)
        hashgrep { $a > 100 }                    # look for key/value pairs where the key is > 100
        map { int(rand(1000)) => $_ } 'a'..'z';  # pair each letter (the value) with a random key from 0 to 999

The "values" of a list are the items at odd indices (1, 3, 5, ...). Note that for an "empty slot" in a sparse array, the value will be undef.
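
To spell out the indexing, here is a minimal pure-Perl sketch of the same idea. The helpers even_items and odd_items are hypothetical names used only for illustration; they are not part of Hash::MostUtils, and this is not necessarily how lkeys and lvalues are implemented.

```perl
use strict;
use warnings;

# Hypothetical stand-ins for illustration only (not the module's code).
# "Keys" sit at indices 0, 2, 4, ...; "values" sit at indices 1, 3, 5, ...
sub even_items { my @list = @_; return @list[ grep { $_ % 2 == 0 } 0 .. $#list ] }
sub odd_items  { my @list = @_; return @list[ grep { $_ % 2 == 1 } 0 .. $#list ] }

my @keys   = even_items(1 .. 10);
my @values = odd_items(1 .. 10);

print "keys:   @keys\n";
print "values: @values\n";

# A sparse array: index 1 is never assigned, so its "value" comes back undef.
my @sparse = ('first');
$sparse[2] = 'second';
my @sparse_values = odd_items(@sparse);
print "sparse value is ", (defined $sparse_values[0] ? 'defined' : 'undef'), "\n";
```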

leach [ ARRAY | HASH | ARRAYREF | HASHREF ]

Iterate over an ARRAY, HASH, ARRAYREF, or HASHREF, returning successive "key/value" pairs. This behaves like Perl's built-in each keyword, but it also works on arrays and on array and hash references, including objects built around blessed array and hash references.

    my @array = (1..4);

    while (my ($k, $v) = leach @array) {
        print "$k => $v\n";
    }

    print "$_\n" for @array;

    __END__
    1 => 2
    3 => 4
    1
    2
    3
    4

Using leach to gather key/value pairs from a collection is guaranteed to be non-destructive to that collection. A common pattern for iterating over arrays and array references in pairs is to use splice, which has the possibly unintended side effect of emptying the subject collection:

    my @array = (1..4);

    while (my ($k, $v) = splice @array, 0, 2) {
        print "$k => $v\n";
    }

    print "$_\n" for @array;

    __END__
    1 => 2
    3 => 4
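
For contrast with the splice loop, a non-destructive pairwise walk can be sketched in plain Perl by tracking an index instead of removing items. The helper pair_iterator below is hypothetical and only illustrates the idea; it is not part of Hash::MostUtils and makes no claim about how leach works internally.

```perl
use strict;
use warnings;

# pair_iterator is a hypothetical helper, not part of Hash::MostUtils.
# It walks an array reference by index, so the array itself is never modified.
sub pair_iterator {
    my ($array) = @_;
    my $i = 0;
    return sub {
        return if $i > $#$array;              # exhausted: empty list ends the loop
        my @pair = @{$array}[ $i, $i + 1 ];   # second item is undef on odd-length input
        $i += 2;
        return @pair;
    };
}

my @array = (1 .. 4);
my $next  = pair_iterator(\@array);

my @seen;
while (my ($k, $v) = $next->()) {
    push @seen, "$k => $v";
}

print "$_\n" for @seen;
print scalar(@array), " items still in \@array\n";
```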

Note the distinction between saying that this function is

    leach ARRAY

rather than

    leach LIST

Perl does not allow this behavior:

    while (my ($k, $v) = leach 1..10) {                   # can't leach a list, only an array
        # do something with this key/value tuple
    }

But don't worry, Perl also doesn't allow for this behavior:

    while (my ($k, $v) = splice 1..10, 0, 2) {            # can't splice a list, only an array
        # do something with this key/value tuple
    }

FUNCTIONS TO OPERATE ON LISTS, ARRAYS, AND HASHES AS TUPLES

hashmap, hashgrep, and hashapply act like map, grep, and List::MoreUtils::apply, with one notable exception: whereas map, grep, and apply consume the given list one item at a time and assign the current value to $_, hashmap, hashgrep, and hashapply consume it two items at a time and assign them to $a and $b.

The names $a and $b were chosen because Perl already treats them specially: they are package variables that are exempt from 'strict vars', due to sort's need for them.

If $a and $b each occur only once in your program, you will probably see this warning from Perl:

    Name 'main::a' used only once: possible typo at ...
    Name 'main::b' used only once: possible typo at ...

I've just gotten in the habit of adding:

    use strict;
    use warnings; no warnings 'once';

when I see that message.

hashmap BLOCK LIST

This acts similar to

    map BLOCK LIST

with the exception that map eats items off of LIST one at a time, assigning the current value to $_; whereas hashmap eats items off of LIST two at a time, assigning the first value to $a and the second value to $b.

    # naive transformation of this hash into (101 => 'A', 102 => 'B')
    my %hash = (
        a => 1,
        b => 2,
    );

    my %transformed =
        hashmap { $b + 100 => uc($a) }
        %hash;

Just like map, your BLOCK will be called without any arguments. Like Perl's map keyword, this function maintains the order of LIST.

hashmap is simply a prototyped alias for n_map(2, CODEREF, LIST), so all of the documentation to n_map applies here.
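
To make the two-at-a-time behavior concrete, here is a rough pure-Perl sketch. The name pair_map is hypothetical, and the sketch assumes the helper and its caller live in the same package (so that $a and $b are the same variables in both); the real hashmap does not have that restriction and may be implemented quite differently.

```perl
use strict;
use warnings;

# pair_map is a hypothetical illustration, not the module's hashmap.
# It lives in the caller's package, so $a and $b below are the same
# package variables the block sees.
sub pair_map (&@) {
    my ($code, @list) = @_;
    my @out;
    while (my ($key, $value) = splice @list, 0, 2) {
        local ($a, $b) = ($key, $value);   # first item in $a, second in $b
        push @out, $code->();              # the block is called with no arguments
    }
    return @out;
}

my @pairs = pair_map { ($b + 100 => uc $a) } (a => 1, b => 2);

print "@pairs\n";
```

Because the input here is a list rather than a hash, the output order is fixed: the pairs come back in the order they went in.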

hashgrep BLOCK LIST

This acts similar to

    grep BLOCK LIST

with the exception that grep eats items off of LIST one at a time, assigning the current value to $_; whereas hashgrep eats items off of LIST two at a time, assigning the first value to $a and the second value to $b.

    # lame object dumper
    my $object = Some::Class->new(...);

    my %dump =
        hashgrep { $a !~ /^_/ && ! ref($b) }   # hide private fields and internal data structures
        %$object;

Just like grep, your BLOCK will be called without any arguments. Like Perl's grep keyword, this function maintains the order of LIST.

hashgrep is simply a prototyped alias for n_grep(2, CODEREF, LIST), so all of the documentation to n_grep applies here.

hashapply BLOCK LIST

This is similar to List::MoreUtils::apply:

    apply BLOCK LIST

with the usual exception: apply eats items off of LIST one at a time, assigning to $_; whereas hashapply eats items off of LIST two at a time, assigning the first value to $a and the second value to $b.

Normal apply can be written as map:

    my @words = qw(apple banana cherimoya);
    my @clean1 = map { tr/aeiou//d; $_ } @words;
    # @clean1 = @words = qw(ppl bnn chrmy);

    @words = qw(apple banana cherimoya);
    my @clean2 = apply { tr/aeiou//d } @words;
    # @clean2 = qw(ppl bnn chrmy); @words = qw(apple banana cherimoya);

Note that apply does not transform the original data, whereas map does. Similarly, hashapply does not transform the original data, whereas hashmap might.

Note that apply does not need to explicitly return $_, whereas map does. Similarly, hashapply does not need to explicitly return a key/value tuple ($a, $b), whereas hashmap does need to return something.

Like apply, hashapply will not transform the original LIST.
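
The copy-then-modify behavior described above can be sketched in plain Perl as follows. The name pair_apply is hypothetical, and as a simplification it assumes the helper and the calling code share a package; it illustrates the semantics only and is not the module's implementation.

```perl
use strict;
use warnings;

# pair_apply is a hypothetical illustration, not the module's hashapply.
sub pair_apply (&@) {
    my ($code, @list) = @_;        # @list is a copy of the caller's data
    my @out;
    while (my ($key, $value) = splice @list, 0, 2) {
        local ($a, $b) = ($key, $value);
        $code->();                 # the block's return value is ignored
        push @out, $a, $b;         # collect whatever the block left in $a and $b
    }
    return @out;
}

my @fruit   = (apple => 'red', banana => 'yellow');
my @shouted = pair_apply { $b = uc $b } @fruit;

print "result:   @shouted\n";
print "original: @fruit\n";
```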

hashsort BLOCK LIST

Sort LIST by BLOCK, handling two tuples at a time. $a and $b will each have the form:

    $a = +{key => ..., value => ...};
    $b = +{key => ..., value => ...};

This call:

    my %hash = (a => 1, n => 14, m => 13, b => 2, z => 26);
    my @sorted =
      hashsort { $b->{key} cmp $a->{key} }
      %hash;

Is equivalent to this:

    my %hash = (a => 1, n => 14, m => 13, b => 2, z => 26);

    my @sorted =
      map { ($_->{key} => $_->{value}) }
      sort { $b->{key} cmp $a->{key} }
      map { +{key => $_, value => $hash{$_}} }
      keys %hash;

hashsort is the sort-body of a Schwartzian transform over a list of tuples.
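
As a further illustration of the wrap, sort, and unwrap steps, here is the same plain-Perl pattern ordered by value instead of by key. It uses only core Perl (no hashsort), so it shows the shape of the comparison block rather than the module's own code.

```perl
use strict;
use warnings;

my %hash = (a => 1, n => 14, m => 13, b => 2, z => 26);

my @by_value =
    map  { ($_->{key} => $_->{value}) }            # 3. flatten back into key, value
    sort { $a->{value} <=> $b->{value} }           # 2. compare two wrapped pairs
    map  { +{ key => $_, value => $hash{$_} } }    # 1. wrap each pair in a hashref
    keys %hash;

print "@by_value\n";
```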

GENERIC N-ARY FORMS OF VARIOUS LIST-WISE FUNCTIONS

With the exception of hashsort, each of the pairwise functions mentioned so far - leach, hashmap, hashgrep, hashapply - is actually implemented in terms of a more generic N-ary form. This means that if you need to process a list in sets of N, where N is > 2, you may use the n_* forms of these functions.

Variable naming becomes more interesting when moving beyond 2 items. Whereas $a and $b are always available without declaration, once you go to N of 3, you need to agree on some variable naming convention.

$a and $b work nicely for the first two elements of a list; so $c is the third, and $d the fourth, and so on. One limitation of this naming scheme is that you may not easily go beyond N of 26 - but if you find yourself needing that, you'll find the code simple to extend.

To keep 'strict vars' from complaining about $c..$z, you'll need to refer to those variables by their package-qualified names:

    my @sets =
        n_map   6, sub { [$a, $b, $::c, $::d, $::e, $::f] },
        n_apply 3, sub { $_ *= 3 for $a, $b, $::c },
        n_grep  3, sub { $::c > 4 },
        (1..9);                             # @sets = ([12, 15, 18, 21, 24, 27]);

I personally find the transition between $b and $::c to be a bit jarring visually, so the one time I wrote a line like the above I chose to write it as $::a and $::b.

    my @sets =
        n_map   6, sub { [$::a, $::b, $::c, $::d, $::e, $::f] },
        n_apply 3, sub { $_ *= 3 for $::a, $::b, $::c },
        n_grep  3, sub { $::c > 4 },
        (1..9);                             # @sets = ([12, 15, 18, 21, 24, 27]);
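
A small core-Perl sketch of why the qualified spelling is accepted: $::c is shorthand for $main::c, and fully qualified package variables do not need a declaration under strict. Declaring the variable with our is another option; the variable names here are just examples.

```perl
use strict;
use warnings;

our $d;            # one alternative: declare the package variable with "our"

$::c = 3;          # $::c is shorthand for $main::c; qualified names satisfy strict
$d   = 4;          # unqualified use is fine after "our"

print "c + d = ", $::c + $d, "\n";
print "same variable\n" if \$::c == \$main::c;
```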

n_each N, LIST

Iterate over LIST, returning successive "key/values" sets.

    my @list = (1..9);

    while (my ($k, @v) = n_each 3, @list) {
        # do something with this $k and @v
    }

There's nothing that says your N needs to remain constant:

    my @list = (
        a => 1,
        b => 1, 2,
        c => 1, 2, 3,
        d => 1, 2, 3, 4,
    );

    my $n = 2;

    my %triangle;
    while (my ($k, @v) = n_each $n++, @list) {
        $triangle{$k} = \@v;
    }

    __END__
    %triangle = (
        a => [1],
        b => [1, 2],
        c => [1, 2, 3],
        d => [1, 2, 3, 4],
    );

There's probably something clever that you can do with this that I just don't understand. Please drop me a line if you know what it is.
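
For readers curious how a varying chunk size can work, here is a plain-Perl sketch of an iterator that takes N on every call. The helper chunk_iterator is hypothetical and is not how n_each is necessarily implemented; it only shows that the position in the list, not the chunk size, is what has to be remembered between calls.

```perl
use strict;
use warnings;

# chunk_iterator is a hypothetical helper, not part of Hash::MostUtils.
sub chunk_iterator {
    my ($list) = @_;
    my $pos = 0;
    return sub {
        my ($n) = @_;
        return if $pos > $#$list;               # nothing left: ends the while loop
        my $end = $pos + $n - 1;
        $end = $#$list if $end > $#$list;       # a short final chunk is allowed
        my @chunk = @{$list}[ $pos .. $end ];
        $pos += $n;
        return @chunk;
    };
}

my @list = (
    a => 1,
    b => 1, 2,
    c => 1, 2, 3,
);

my $next = chunk_iterator(\@list);
my $n    = 2;

my %triangle;
while (my ($k, @v) = $next->($n++)) {
    $triangle{$k} = [@v];                       # copy the values for this key
}

print "$_ => [@{ $triangle{$_} }]\n" for sort keys %triangle;
```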

n_map N, CODEREF, LIST

map CODEREF over LIST, operating in N-sized chunks. Within the context of CODEREF, values of LIST will be selected and aliased. LIST must be evenly divisible by N.

See "GENERIC N-ARY FORMS OF VARIOUS LIST-WISE FUNCTIONS" for a discussion of variable names.

    my @transformed = n_map(
        3,
        sub { "$a, $b $::c!\n" },
        qw(goodnight sweet prince goodbye cruel world),
    );

    # @transformed = ("goodnight, sweet prince!\n", "goodbye, cruel world!\n");

If you are consistently n_map'ping by some N, then you might consider wrapping n_map so the call syntax looks more like one of Perl's functional keywords:

    sub tri_map (&@) { unshift @_, 3; goto &n_map }

    my @transformed =
        tri_map { "$::a, $::b $::c!\n" }
        qw(goodnight sweet prince goodbye cruel world);

    # @transformed = ("goodnight, sweet prince!\n", "goodbye, cruel world!\n");
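
If the package variables feel awkward, the same chunked mapping can be sketched in plain Perl by passing each chunk to the callback in @_. The helper chunk_map is hypothetical and is not the module's n_map; in particular, dying on a list whose length is not a multiple of N is a choice made for this sketch, not documented behavior of n_map.

```perl
use strict;
use warnings;

# chunk_map is a hypothetical helper, not the module's n_map. It hands each
# chunk to the callback in @_ instead of in $a, $b, $::c, ...
sub chunk_map {
    my ($n, $code, @list) = @_;
    die "list length is not a multiple of $n\n" if @list % $n;
    my @out;
    while (@list) {
        push @out, $code->(splice @list, 0, $n);
    }
    return @out;
}

my @transformed = chunk_map(
    3,
    sub { "$_[0], $_[1] $_[2]!\n" },
    qw(goodnight sweet prince goodbye cruel world),
);

print @transformed;
```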

n_grep N, CODEREF, LIST

grep for CODEREF over LIST, operating in N-sized chunks. Within the context of CODEREF, values of LIST will be selected and aliased. LIST must be evenly divisible by N.

See "GENERIC N-ARY FORMS OF VARIOUS LIST-WISE FUNCTIONS" for a discussion of variable names.

    my @found = n_grep(
        3,
        sub { $a =~ /good/ && $::c =~ /prince/ },
        qw(goodnight sweet prince goodbye cruel world),
    );

    # @found = qw(goodnight sweet prince);

Just as with n_map, writing a small bit of gloss to make your N of n_grep work in a functional manner is simple, and makes your code more readable:

    sub tri_grep (&@) { unshift @_, 3; goto &n_grep }

    my @found =
        tri_grep { $::a =~ /good/ && $::c =~ /prince/ }
        qw(goodnight sweet prince goodbye cruel world);

    # @found = qw(goodnight sweet prince);

n_apply N, CODEREF, LIST

Apply CODEREF to LIST in the manner of List::MoreUtils::apply, operating in N-sized chunks. LIST must be evenly divisible by N.

See "GENERIC N-ARY FORMS OF VARIOUS LIST-WISE FUNCTIONS" for a discussion of variable names.

    my @uppercase = n_apply(
        3,
        sub { $::c = uc $::c },
        qw(goodnight sweet prince goodbye cruel world),
    );

    # @uppercase = qw(goodnight sweet PRINCE goodbye cruel WORLD);

Just as with n_map, writing a small bit of gloss to make your N of n_apply work in a functional manner is simple, and makes your code more readable:

    sub tri_apply (&@) { unshift @_, 3; goto &n_apply }

    my @uppercase =
        tri_apply { $::c = uc $::c }
        qw(goodnight sweet prince goodbye cruel world);

    # @uppercase = qw(goodnight sweet PRINCE goodbye cruel WORLD);

GRAB BAG

I like these functions, but they're decidedly different from everything up to this point. They are mostly used to turn an existing hash reference or object into a smaller representation of itself.

hash_slice_of HASHREF, LIST

Looks into HASHREF and extracts the key/value pairs of the keys named in LIST. If a key in LIST is not present in HASHREF, its value in the result is undef.

    my %hash = (1..10);

    my %slice = hash_slice_of \%hash, qw(5 7 9 11);

    __END__
    %slice = (
        5 => 6,
        7 => 8,
        9 => 10,
        11 => undef,
    );

If you only want to get back key/value pairs for keys in LIST that exist in HASHREF, just add a hashgrep:

    my %hash = (1..10);

    my %slice =
        hashgrep { exists $hash{$a} }
        hash_slice_of \%hash, qw(5 7 9 11);

    __END__
    %slice = (
        5 => 6,
        7 => 8,
        9 => 10,
    );
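
For comparison, both results can be produced with core Perl alone. This sketch does not use hash_slice_of; it only shows the two behaviors (missing keys kept as undef, or filtered out) side by side.

```perl
use strict;
use warnings;

my %hash   = (1 .. 10);          # (1 => 2, 3 => 4, 5 => 6, 7 => 8, 9 => 10)
my @wanted = qw(5 7 9 11);

# Every requested key appears; missing keys map to undef.
my %slice = map { $_ => $hash{$_} } @wanted;

# Only the requested keys that actually exist in %hash.
my %existing = map { $_ => $hash{$_} } grep { exists $hash{$_} } @wanted;

printf "%d keys in slice, %d in existing\n", scalar keys %slice, scalar keys %existing;
```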

hash_slice_by OBJECT, LIST

Calls the methods named in LIST on OBJECT and returns a hash of the results. If OBJECT cannot perform a method named in LIST, you will get the standard "Can't locate object method ..." error that Perl throws in this circumstance.

    my $object = ...;
    my %out = hash_slice_by $object, qw(foo bar baz);

    __END__
    %out = (
        foo => 'output of foo',
        bar => 'output of bar',
        baz => 'output of baz',
    );

Note that you may not use hash_slice_by to pass arguments to the methods given in LIST. Note too that your methods are invoked in scalar context.
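
The behavior described above (no arguments, scalar context) can be sketched in core Perl like this. Demo::Box and its width and height methods are made up for the example, and the map expression is an illustration rather than the module's implementation.

```perl
use strict;
use warnings;

{
    # Demo::Box is a made-up class for this example.
    package Demo::Box;
    sub new    { my ($class, %args) = @_; return bless {%args}, $class }
    sub width  { return $_[0]{width} }
    sub height { return $_[0]{height} }
}

my $object = Demo::Box->new(width => 3, height => 4);

# Call each named method with no arguments, in scalar context.
my %out = map { my $method = $_; ($method => scalar $object->$method()) } qw(width height);

print "width=$out{width} height=$out{height}\n";
```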

rekey BLOCK HASH

Rename the keys in HASH by the mapping table provided by BLOCK. HASH may be a real hash, or it may be an array that you are treating like a key/value store.

    my %hash = (crow => 'black', snow => 'white', libro => 'read all over');
    my %spanish = rekey { crow => 'corvino', snow => 'nieve' } %hash;

    __END__
    %spanish = (
        corvino => 'black',
        nieve   => 'white',
        libro   => 'read all over',
    );
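
The same renaming can be written in core Perl, which makes the fallback explicit: keys without an entry in the mapping table keep their original name. The hash name %new_key is chosen for this sketch only.

```perl
use strict;
use warnings;

my %hash    = (crow => 'black', snow => 'white', libro => 'read all over');
my %new_key = (crow => 'corvino', snow => 'nieve');

# Use the mapped name when there is one; otherwise keep the original key.
my %spanish = map {
    my $key = exists $new_key{$_} ? $new_key{$_} : $_;
    ($key => $hash{$_});
} keys %hash;

print "$_ => $spanish{$_}\n" for sort keys %spanish;
```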

revalue BLOCK HASH

Rename the values in HASH to the mapping table provided by BLOCK. HASH may be a real hash, or it may be an array that you are treating like a key/value store.

    my @start = (apple => 'red', apple => 'green');
    my @translated = revalue { red => 'rojo', green => 'verde' } @start;

    __END__
    @translated = (
        apple => 'rojo',
        apple => 'verde',
    );

reindex BLOCK LIST

Reorder the values in LIST by the mapping table provided by BLOCK. LIST may be either an array or a list. In general this function will not work on hashes.

    my @array = (1..5);
    my @reindexed = reindex { map { $_ => $_ + 1 } 0..$#array } @array;

    __END__
    @reindexed = (undef, 1..5);
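
Written out in core Perl, the example above amounts to moving each item to the index its old index maps to. This sketch covers only the case where every index has an entry in the mapping; how reindex treats unmapped indices is not addressed here.

```perl
use strict;
use warnings;

my @array     = (1 .. 5);
my %new_index = map { $_ => $_ + 1 } 0 .. $#array;   # old index => new index

my @reindexed;
$reindexed[ $new_index{$_} ] = $array[$_] for 0 .. $#array;

# Index 0 was never assigned, so it prints as undef.
print join(', ', map { defined $_ ? $_ : 'undef' } @reindexed), "\n";
```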

ACKNOWLEDGEMENTS

The names and behaviors of most of these functions were initially developed at AirWave Wireless, Inc. I've re-implemented them here.

This software would be trapped on my hard drive were it not for Logan Bell's encouragement to release it. Separating the personal time I have put into this from the professional time afforded by my employer, Shutterstock, Inc. would be very difficult. Thankfully I haven't needed to; when I asked to share this, Dan McCormick simply said, "Go for it! Thanks for hacking."

COPYRIGHT AND LICENSE

    (c) 2013 by Belden Lyman

This library is free software: you may redistribute it and/or modify it under the same terms as Perl itself; either Perl version 5.8.8 or, at your option, any later version of Perl 5 you may have available.