Iterator::Flex::Manual::Basics - Iterator Basics
version 0.12
An iterator is something which encapsulates a source of data and parcels it out one chunk at a time. Iterators usually need to keep track of the state of the data stream and which chunk they should next return.
For example, imagine iterating through an array, returning one element at a time. The state required would be the array and the index of the next element to return. Here's a simple iterator which uses a hash to keep track of state:
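    use v5.36;    # enables strict, warnings, signatures, and say

    sub iterate ( $state ) {
        my $array = $state->{array};
        # $state is a hash reference, so its fields are reached with ->
        return $state->{index} > $#$array ? undef : $array->[ $state->{index}++ ];
    }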
We could use this via:
    my %state = ( array => [ 0 .. 20 ], index => 0 );
    while ( defined( my $value = iterate( \%state ) ) ) {
        say $value;
    }
This illustrates the three typical phases of an iterator:
Initialized: The iterator's state has been set up.
Iteration: The iterator has returned at least one element of data, but may not know if there are more.
Exhaustion: The iterator has definitely run out of data.
(There's a fourth state, Error.)
Exhaustion is traditionally signaled via:
Returning a sentinel value.
Throwing an exception.
Setting a Boolean predicate in a multi-valued return, e.g.
{ value => $value, success => $bool }
There's no right way to do it, just different trade-offs; see Iterator::Flex::Manual::PriorArt for how other languages and Perl modules handle it.
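The three signaling styles listed above can be sketched side by side with plain closures. This is only an illustration: the constructor names (make_sentinel_iter, make_throwing_iter, make_record_iter) and the "exhausted" exception text are hypothetical and are not part of Iterator::Flex.

```perl
use strict;
use warnings;

# Style 1: return a sentinel (here undef) when the data runs out.
sub make_sentinel_iter {
    my @data  = @_;
    my $index = 0;
    return sub { $index < @data ? $data[ $index++ ] : undef };
}

# Style 2: throw an exception when the data runs out.
sub make_throwing_iter {
    my @data  = @_;
    my $index = 0;
    return sub {
        die "exhausted\n" if $index >= @data;
        return $data[ $index++ ];
    };
}

# Style 3: return a record carrying both the value and a success flag.
sub make_record_iter {
    my @data  = @_;
    my $index = 0;
    return sub {
        return { value => undef, success => 0 } if $index >= @data;
        return { value => $data[ $index++ ], success => 1 };
    };
}

# Drain each iterator using the loop idiom that suits its style.
my ( @via_sentinel, @via_exception, @via_record );

my $s = make_sentinel_iter( 1, 2, 3 );
while ( defined( my $v = $s->() ) ) { push @via_sentinel, $v }

my $t = make_throwing_iter( 1, 2, 3 );
eval {
    while (1) { push @via_exception, $t->() }
    1;
} or do {
    die $@ unless $@ eq "exhausted\n";    # rethrow anything unexpected
};

my $r = make_record_iter( 1, 2, 3 );
while ( ( my $rec = $r->() )->{success} ) { push @via_record, $rec->{value} }
```

The sentinel loop is the shortest to write, the exception version needs an eval wrapper, and the record version costs a hash allocation per element; which cost matters most depends on the application.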
Returning a sentinel value is often good enough, but only if that value doesn't exist in your data stream. Our example iterator returns undef when it has exhausted the data source. However, imagine that the array contains temperature measurements taken at uniform intervals; an undef value may indicate that there was a problem taking a measurement (similar to how one would use null in a database), e.g.
my @Temp = ( 22, 23.2, undef, 24, ... );
The iterator itself happily keeps going until it runs out of data, but when it returns the undef value, our example code above interprets it as the iterator signaling exhaustion and will stop querying the iterator. Obviously that's wrong.
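The premature stop can be reproduced with a short script. The sketch below assumes the temperature list is trimmed to four values (the original list is elided) and uses a signature-free copy of the iterator, named iterate_undef_sentinel here purely for illustration.

```perl
use strict;
use warnings;

# Same logic as the hash-based iterator above, written without a signature.
sub iterate_undef_sentinel {
    my ($state) = @_;
    my $array = $state->{array};
    return $state->{index} > $#$array ? undef : $array->[ $state->{index}++ ];
}

# The third measurement is missing (undef).
my @Temp = ( 22, 23.2, undef, 24 );
my %temp_state = ( array => \@Temp, index => 0 );

my @seen;
while ( defined( my $value = iterate_undef_sentinel( \%temp_state ) ) ) {
    push @seen, $value;
}

# The loop stopped at the undef measurement: only two values were collected,
# and the final measurement (24) was never retrieved.
printf "saw %d of %d measurements\n", scalar @seen, scalar @Temp;
```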
One option is to use a value that is known not to occur in the data. If your temperature is measured in Kelvin, which is never negative, a negative value can serve as a sentinel. However, that requires that the sentinel value be an input parameter to the iterator.
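A minimal sketch of passing the sentinel in as a parameter follows. The constructor make_iter_with_sentinel and the sample readings are assumptions made for this example; they do not show how Iterator::Flex itself is configured.

```perl
use strict;
use warnings;

# Hypothetical constructor: the caller supplies the sentinel.
sub make_iter_with_sentinel {
    my ( $array, $sentinel ) = @_;
    my $index = 0;
    return sub { $index < @$array ? $array->[ $index++ ] : $sentinel };
}

# Temperatures in Kelvin, one missing; -1 can never be a real reading.
my @kelvin      = ( 295.15, 296.35, undef, 297.15 );
my $SENTINEL    = -1;
my $next_kelvin = make_iter_with_sentinel( \@kelvin, $SENTINEL );

my @readings;
while (1) {
    my $value = $next_kelvin->();

    # An undef reading is data, so test definedness before comparing.
    last if defined $value && $value == $SENTINEL;
    push @readings, $value;
}

# @readings now holds all four entries, including the undef one.
```

The caller has to know the sentinel in two places (construction and the loop test), which is the price of keeping undef available as a legitimate data value.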
Iterator::Flex provides a signal_exhaustion method which currently supports either returning a user defined sentinel or throwing an exception.
Similar issues arise when the iterator must signal an error. For example, if the iterator retrieves from a database and there is a connection issue, the client code must be alerted. This can be done via any of the methods specified in "Iterator Exhaustion".
Most implementations (language or Perl modules) don't provide an explicit specification of how to handle this. Iterator::Flex provides a signal_error method which currently supports throwing an exception.
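To illustrate why errors need a channel separate from exhaustion, here is a rough sketch in which exhaustion is signaled with undef and an error with an exception. The make_flaky_iter constructor, the simulated failure at the third row, and the shape of the exception hash are all hypothetical; they are not the Iterator::Flex signal_error mechanism.

```perl
use strict;
use warnings;

# Hypothetical source: fetching the third row simulates a lost connection.
sub make_flaky_iter {
    my @rows  = @_;
    my $index = 0;
    return sub {
        return undef if $index >= @rows;    # exhaustion: sentinel
        die +{ code => 'connection_lost', at => $index }
          if $index == 2;                   # error: exception
        return $rows[ $index++ ];
    };
}

my $next_row = make_flaky_iter(qw( a b c d ));
my ( @rows_read, $error );

my $ok = eval {
    while ( defined( my $row = $next_row->() ) ) { push @rows_read, $row }
    1;
};
$error = $@ unless $ok;

# The client can tell "ran out of data" apart from "something went wrong".
if ( ref($error) eq 'HASH' ) {
    print "failed at index $error->{at}: $error->{code}\n";
}
```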
Apart from state, an iterator is mostly defined by its capabilities. The only one required is "next", which retrieves a value.
There is a limited set of additional capabilities which are not appropriate to all data sources or iterators, so they are optional.
Some capabilities can be emulated by iterator adapters. The supported capabilities are documented in Iterator::Flex::Manual::Overview, and are:
next
current
prev
rewind
reset
freeze
thaw
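As a rough picture of what a few of these capabilities might look like on an array-backed iterator, consider the sketch below. The class name My::ArrayIter is hypothetical, and the semantics chosen for current and reset are simplifying assumptions; the authoritative definitions are in Iterator::Flex::Manual::Overview. The prev, rewind, freeze, and thaw capabilities are left out.

```perl
use strict;
use warnings;

package My::ArrayIter;    # hypothetical class, for illustration only

sub new {
    my ( $class, @data ) = @_;
    return bless { data => [@data], index => 0 }, $class;
}

# next: return the next value, or undef once the data is exhausted.
sub next {
    my $self = shift;
    return undef if $self->{index} > $#{ $self->{data} };
    return $self->{data}[ $self->{index}++ ];
}

# current: the value most recently returned by next
# (undef before the first call; an assumption of this sketch).
sub current {
    my $self = shift;
    return $self->{index} > 0 ? $self->{data}[ $self->{index} - 1 ] : undef;
}

# reset: start over from the first element.
sub reset {
    my $self = shift;
    $self->{index} = 0;
    return;
}

package main;

my $caps    = My::ArrayIter->new( 10, 20, 30 );
my $first   = $caps->next;       # 10
my $second  = $caps->next;       # 20
my $current = $caps->current;    # 20
$caps->reset;
my $again = $caps->next;         # 10
```

Only next is essential; the other methods are conveniences that this data source happens to support cheaply because the whole array stays in memory.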
An iterator generator creates an iterator from a data source, which may be real (such as a data structure in memory, a database, etc.), or virtual (such as a sequence of numbers). Iterator::Flex provides iterator generators via convenience wrappers and classes for: arrays (iarray, Iterator::Flex::Array), numeric sequences (iseq, Iterator::Flex::Sequence), and array-like objects (Iterator::Flex::ArrayLike).
For others, writing an iterator is straightforward; see Iterator::Flex::Manual::Authoring.
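As a rough illustration of a generator over a virtual data source, the following sketch builds a numeric sequence iterator from a closure. The name make_sequence and its argument order are assumptions made for this example; they do not describe the interface of iseq or Iterator::Flex::Sequence.

```perl
use strict;
use warnings;

# Hypothetical generator: builds an iterator over begin, begin+step, ... <= end.
# No array is ever materialized; the only state is the next value.
sub make_sequence {
    my ( $begin, $end, $step ) = @_;
    $step = 1 unless defined $step;
    die "step must be positive\n" unless $step > 0;
    my $next = $begin;
    return sub {
        return undef if $next > $end;
        my $value = $next;
        $next += $step;
        return $value;
    };
}

my $evens = make_sequence( 0, 10, 2 );
my @evens;
while ( defined( my $n = $evens->() ) ) { push @evens, $n }

# @evens is ( 0, 2, 4, 6, 8, 10 )
```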
An iterator adapter acts as a filter or modifier on the output of another iterator. Applying an adapter to an iterator results in another iterator, which can be used as input to another adapter.
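The composition described above can be sketched with closures. The helpers iter_from_list, grep_adapter, and map_adapter are hypothetical names used only for this example, and undef is assumed to be the exhaustion sentinel throughout.

```perl
use strict;
use warnings;

# Source iterator over a list; returns undef when exhausted.
sub iter_from_list {
    my @data  = @_;
    my $index = 0;
    return sub { $index < @data ? $data[ $index++ ] : undef };
}

# Filtering adapter: pass through only values for which $predicate is true.
sub grep_adapter {
    my ( $predicate, $source ) = @_;
    return sub {
        while ( defined( my $value = $source->() ) ) {
            return $value if $predicate->($value);
        }
        return undef;    # the source is exhausted, so this iterator is too
    };
}

# Modifying adapter: transform each value with $transform.
sub map_adapter {
    my ( $transform, $source ) = @_;
    return sub {
        my $value = $source->();
        return defined $value ? $transform->($value) : undef;
    };
}

# Adapters compose: the output of one is the input of the next.
my $pipeline = map_adapter(
    sub { $_[0] * $_[0] },
    grep_adapter( sub { $_[0] % 2 }, iter_from_list( 1 .. 6 ) ),
);

my @odd_squares;
while ( defined( my $v = $pipeline->() ) ) { push @odd_squares, $v }

# @odd_squares is ( 1, 9, 25 )
```

Because each adapter returns a value with the same calling convention as its input, the pipeline can be extended to any depth without the consumer loop changing.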
Iterator::Flex provides adapters, both via convenience wrappers and classes, for:
igrep, Iterator::Flex::Grep
imap, Iterator::Flex::Map
icycle, Iterator::Flex::Cycle
iproduct, Iterator::Flex::Product
icache, Iterator::Flex::Cache
ifreeze, Iterator::Flex::Freeze
There are a number of existing iterator packages on CPAN (see Iterator::Flex::Manual::PriorArt). Iterator::Flex can wrap those iterators so that they can be used within the Iterator::Flex framework. See Iterator::Flex::Manual::Alien.
Please report any bugs or feature requests to bug-iterator-flex@rt.cpan.org or through the web interface at: https://rt.cpan.org/Public/Dist/Display.html?Name=Iterator-Flex
Source is available at
https://gitlab.com/djerius/iterator-flex
and may be cloned from
https://gitlab.com/djerius/iterator-flex.git
Please see those modules/websites for more information related to this module.
Iterator::Flex
Iterator::Flex::Manual
Diab Jerius <djerius@cpan.org>
This software is Copyright (c) 2018 by Smithsonian Astrophysical Observatory.
This is free software, licensed under:
The GNU General Public License, Version 3, June 2007