The Perl Toolchain Summit needs more sponsors. If your company depends on Perl, please support this very important event.

NAME

Set::Integer::Gapfillers - Fill in the gaps between integer ranges

SYNOPSIS

    use Set::Integer::Gapfillers;
    $gf = Set::Integer::Gapfillers->new(
        lower   => -12,
        upper   =>  62,
        sets    => [
            [  1, 17 ],     # Note:  Use comma, not 
            [ 25, 42 ],     # range operator (..)
            [ 44, 50 ],
        ],
    );

    $segments_needed_ref = $gf->segments_needed();

    $gapfillers_ref      = $gf->gapfillers();

    $all_segments_ref    = $gf->all_segments();

Any of the three preceding output methods can also be called with an expand option:

    $segments_needed_ref = $gf->segments_needed( expand => 1 );

DESCRIPTION

This Perl extension provides methods which may be useful in manipulating sets whose elements are consecutive integers. Suppose that you are provided with the following non-intersecting, non-overlapping sets of consecutive integers:

    {  1 .. 17 } 
    { 25 .. 42 } 
    { 44 .. 50 }

Suppose further that you are provided with the following lower and upper bounds to a range of consecutive integers:

    lower:  12
    upper:  62

Provide a set of sets which:

  • when joined together, would form a set of consecutive integers from the lower to the upper bound, inclusive; and

  • are derived from:

    • the sets provided;

    • proper subsets thereof; or

    • newly generated sets which fill in the gaps below, in between or above the provided sets.

Once a Set::Integer::Gapfillers object has been constructed, its segments_needed() method can be used to provide these results:

    { 12 .. 17 }    # subset of 1st set provided
    { 18 .. 24 }    # gap-filler set
    { 25 .. 42 }    # 2nd set provided
    { 43 .. 43 }    # gap-filler set
                    # (which happens to consist of a single element)
    { 44 .. 50 }    # 3rd set provided
    { 51 .. 62 }    # gap-filler set for range above highest provided set

Alternatively, you may only wish to examine the gap-filler sets. The gapfillers() method provides this set of sets.

    { 18 .. 24 }    # gap-filler set
    { 43 .. 43 }    # gap-filler set
    { 51 .. 62 }    # gap-filler set

And, as an additional alternative, you may wish to have your set of sets begin or end with all the values of a given provided set, rather than a proper subset thereof containing only those values needed to populate the desired range. In that case, use the all_segments() method.

    {  1 .. 17 }    # 1st set provided
    { 18 .. 24 }    # gap-filler set
    { 25 .. 42 }    # 2nd set provided
    { 43 .. 43 }    # gap-filler set
                    # (which happens to consist of a single element)
    { 44 .. 50 }    # 3rd set provided
    { 51 .. 62 }    # gap-filler set for range above highest provided set

The results returned by the all_segments() method differ from those returned by the segments_needed() method only at the lower or upper ends. If, as in the above example, the lower bound of the target range of integers falls inside a provided segment, the first set returned by all_segments() will be the entire first set provided; the first set returned by segments_needed() will be a proper subset of the first set provided, starting with the requested lower bound.

USAGE

Publicly Callable Methods

new()

    $gf = Set::Integer::Gapfillers->new(
        lower   => -12,
        upper   =>  62,
        sets    => [
            [  1, 17 ],     # Note:  Use comma, not 
            [ 25, 42 ],     # range operator (..)
            [ 44, 50 ],
        ],
    );

Purpose: Constructor of a Set::Integer::Gapfillers object.

Arguments: List of key-value pairs. lower and upper take integers denoting the lower and upper bounds of the range of integers desired as the result. sets takes a reference to an anonymous array whose elements are, in turn, references to anonymous arrays whose two elements are the lowest and highest numbers in a range of consecutive integers.

Note: The sets of consecutive integers supplied must be non-overlapping. Set::Integer::Gapfillers will croak if supplied with arguments such as these:

    $gf = Set::Integer::Gapfillers->new(
        lower   => -12, upper   =>  62,
        sets    => [
            [  1, 30 ],   # no good:  overlaps with next set
            [ 25, 48 ],   # no good:  overlaps with previous and next sets 
            [ 44, 50 ],   # no good:  overlaps with previous set
        ],
    );

Note: Only two elements should be supplied in the anonymous arrays supplied as elements to the array reference which is the value of sets: the lowest and highest (or, first and last) elements in each array. You should not use Perl's range operator (e.g., [ 25 .. 48 ]) in this instance.

Returns: A Set::Integer::Gapfillers object.

segments_needed()

    $segments_needed_ref   = $gf->segments_needed();

Purpose: Generate a set of sets which (a) when joined together, would form a set of consecutive integers from the lower to the upper bound, inclusive; and (b) are derived from (i) the sets provided; (ii) proper subsets thereof; or (iii) newly generated sets which fill in the gaps below, in between or above the provided sets.

Arguments: None required. expand = 1> is optional (see FAQ).

Returns: A reference to an anonymous array whose elements are, in turn, anonymous arrays of two elements: the lowest and highest integers in a particular subset. But when the expand option is set, the return value is a reference to an anonymous array whose elements are, in turn, references to arrays each of which holds the entire range of each set needed -- not just the beginning and end points.

gapfillers()

    $gapfillers_ref = $gf->gapfillers();

Purpose: Generate a set of the newly generated sets needed to fill in the gaps below, in between or above the sets provided to the constructor. The sets, like those returned by segments_needed(), are denoted by their lower and upper bounds rather than by their entire contents.

Arguments: None required. expand = 1> is optional (see FAQ).

Returns: A reference to an anonymous array whose elements are, in turn, anonymous arrays holding two elements: the lower and upper bounds of the integer ranges needed to provide gap-filling as described in 'Purpose'. When the expand option is set, the contents of those inner sets are expanded to include the full range of integers needed, not just the beginning and end points.

all_segments()

    $all_segments_ref = $gf->all_segments();

Purpose: Generate a set of all sets needed in order to populate a set of consecutive integers from the lower to the upper bound, inclusive. The sets generated are derived from (a) the sets provided or (b) newly generated sets which fill in the gaps below, in between or above the provided sets.

Arguments: None required. expand = 1> is optional (see FAQ).

Returns: A reference to an anonymous array whose elements are, in turn, anonymous arrays holding the sets described in 'Purpose'. When the expand option is set, the contents of those inner sets are expanded to include the full range of integers needed, not just the beginning and end points.

FAQ

1. How do the sets returned by the three non-constructor methods differ from one another?

With segments_needed(), the objective is: Show me whether the integers I need to fill the desired range will come from sets already provided or from newly created gap-filling sets.

With gapfillers(), the objective is: Show me the sets of integers I will need to create to fill the gaps between the sets already provided.

With all_segments(), the objective is: Show me all the sets of integers -- those already provided and those I will have to create -- from which I will pull integers to populate the desired range.

Here are two examples:

  •     $gf = Set::Integer::Gapfillers->new(
            lower   =>  10,
            upper   =>  22,
            sets    => [
                [  9, 11 ],
                [ 15, 18 ],
                [ 20, 24 ],
            ],
        );

    The three non-constructor methods return sets as follows:

                          9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24
        segments_needed:     A  A  B  B  B  C  C  C  C  D  E  E  E
        gapfillers:                B  B  B              D
        all_segments:     A  A  A  B  B  B  C  C  C  C  D  E  E  E  E  E

    ... where A, C and E are elements coming from provided sets and B and D are coming from newly-created gap-filling sets.

  •     $gf = Set::Integer::Gapfillers->new(
            lower   =>  10,
            upper   =>  22,
            sets    => [
                [  9, 11 ],
                [ 15, 18 ],
                [ 20, 20 ],
            ],
        );

    The three non-constructor methods return sets as follows:

                          9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24
        segments_needed:     A  A  B  B  B  C  C  C  C  D  E  F  F
        gapfillers:                B  B  B              D     F  F
        all_segments:     A  A  A  B  B  B  C  C  C  C  D  E  F  F       

    ... where A, C and E are elements coming from provided sets and B, D and F are coming from newly-created gap-filling sets.

2. Why do the output methods, by default, return references to two-element arrays rather than the full range of integers needed?

Memory and speed.

In an earlier implementation, Set::Integer::Gapfillers calculated its return values by supplying the constructor's sets argument with a list of references to arrays of consecutive integers -- [ 12 .. 22 ] -- rather than a list of references to two-element arrays of the lower and upper bounds of the integer ranges desired -- [ 12, 22 ]. All internal calculations were made by comparing the lower and upper bounds supplied with the arrays supplied. This proved to be a memory hog and slow.

Set::Integer::Gapfillers was then revised to require the user to supply only the beginning and end points of the provided segments. Although this complicated the logic of the internal calculations for the module author, it led to a vastly reduced memory footprint and vast speedup in producing results. It was therefore decided to make the output methods return values in the same manner, i.e., beginning and end points of ranges, rather than the entire ranges.

However, what an end-user of Set::Integer::Gapfillers might really be after is those entire ranges. Hence, the expand = 1> option is provided so that the results look like this:

    $gf = Set::Integer::Gapfillers->new(
        lower   => -12,
        upper   =>  62,
        sets    => [
            [  1, 17 ],
            [ 25, 42 ],
            [ 44, 50 ],
        ],
    );

    $segments_needed_ref = $gf->( expand => 1);
    __END__
    $segments_needed_ref: [
        [-12 ..  0 ],   # without 'expand':  [ -12, 0 ]
        [  1 .. 17 ],
        [ 18 .. 24 ],
        [ 25 .. 42 ],
        [ 43 .. 43 ],
        [ 44 .. 50 ],
        [ 51 .. 62 ],
    ]

BUGS

None reported so far.

SUPPORT

Via e-mail to author at address below.

AUTHOR

    James E Keenan
    CPAN ID: JKEENAN
    jkeenan@cpan.org
    http://search.cpan.org/~jkeenan/

ACKNOWLEDGEMENTS

This Perl extension has its origin in a question I posed on Perlmonks (http://perlmonks.org/?node_id=539350). BrowserUK's response (http://perlmonks.org/?node_id=539357) was ingenious and terse and led me to think that the solution could be modularized. However, when I realized that my original question had not fully specified my objective, I found I could no longer use BrowserUK's algorithm and had to work my own out -- so any bugs are my fault, not his!

COPYRIGHT

Copyright 2006. James E. Keenan. United States.

This program is free software; you can redistribute it and/or modify it under the same terms as Perl itself.

The full text of the license can be found in the LICENSE file included with this module.

SEE ALSO

perl(1). During the Perlmonks thread mentioned in ACKNOWLEDGMENTS, reference was made to CPAN module Set::Infinite (http://search.cpan.org/dist/Set-Infinite/), and specifically to Set::Infinite::minus() as possibly providing another solution to the gap-filling problem. Set::Array and Set::Scalar should also be consulted if you need a wider arrange of methods to perform set operations.