NAME

Sereal::Path::Iterator - iterate and partly deserialize Sereal documents

SYNOPSIS

  use Sereal::Encoder qw/encode_sereal/;
  use Sereal::Path::Iterator;

  my $data = encode_sereal(
      [
        { foo => 'far' },
        { bar => 'foo' },
      ]
  );

  my $spi = Sereal::Path::Iterator->new($data);
  $spi->step_in();          # step inside array
  $spi->next();             # step over first hashref
  my $r = $spi->decode();   # decode second hashref

DESCRIPTION

Sereal::Path::Iterator provides a way to iterate over serialized Sereal documents and decode them partially. For example, given a serialized array of 10 elements, you can deserialize only the third one. Or, given a huge hash, you can decode only the value of key "foo" without touching the others.

Sereal::Path::Iterator has internal state (a pointer/position/offset) which reflects the current position in the parsed document. By calling step_in, step_out, next, rewind or the other functions described below, you can change the state and move forward or backward through the document.

Apart from maintaining a position, Sereal::Path::Iterator also has a stack where each new element on the stack represents a new level in the document. For instance, in the following example the arrayref is located at depth 0, the hashref is at depth 1 and the strings are at depth 2.

  [
    {
        foo => 'bar'
    }
  ]
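
As a rough illustration of how the depth changes while moving through this structure, the sketch below steps into the arrayref and then into the hashref, reading stack_depth at each point. It uses only methods described under METHODS; the variable names are illustrative, and the printed depths are expected to follow the description above rather than being verified output.

```perl
use strict;
use warnings;
use Sereal::Encoder qw/encode_sereal/;
use Sereal::Path::Iterator;

my $spi = Sereal::Path::Iterator->new(encode_sereal([ { foo => 'bar' } ]));

my $depth_at_array = $spi->stack_depth();   # positioned at the arrayref
$spi->step_in();                            # step inside the array, now at the hashref
my $depth_at_hash = $spi->stack_depth();
$spi->step_in();                            # step inside the hash, now at the key 'foo'
my $depth_at_key = $spi->stack_depth();

print "array: $depth_at_array, hash: $depth_at_hash, key: $depth_at_key\n";
```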

METHODS

new

This method creates a new iterator object. It optionally accepts a serialized Sereal document as its first argument and passes it to set.

  my $spi = Sereal::Path::Iterator->new(encode_sereal({}));

set

As an alternative to passing a serialized document to new, you can call this method to achieve the same result.

  my $spi = Sereal::Path::Iterator->new();
  $spi->set(encode_sereal({}));

The current position is set to the very first top-level element of the parsed document.

reset

This method resets the iterator's internal state. The result is the same as calling set again, but without the possible overhead of decompressing the document.

  my $spi = Sereal::Path::Iterator->new(encode_sereal({}));
  # some actions here ...
  $spi->reset(); # return iterator to pre-actions state

eof

eof returns a true value if the end of the document has been reached and a false value otherwise.

info

info inspects the serialized object at the current position and returns information about it. It takes no arguments and returns an array. The content of the returned array depends on the nature of the inspected object, but it always contains at least one element, the type.

1) Type - an integer bitset encoding information about the object. Sereal::Path::Iterator exports the following constants to work with the bitset:

    Constant                    Meaning
    ____________________________________________________________
    SRL_INFO_REF                object is a reference
    SRL_INFO_HASH               object is a hash
    SRL_INFO_ARRAY              object is an array
    SRL_INFO_SCALAR             object is a scalar
    SRL_INFO_REGEXP             object is a regexp
    SRL_INFO_BLESSED            object is blessed
    SRL_INFO_REF_TO             object is a reference to something

Examples:

    Serialized object           Type value
    ____________________________________________________________
    'string'                    SRL_INFO_SCALAR
    []                          SRL_INFO_REF_TO | SRL_INFO_ARRAY
    {}                          SRL_INFO_REF_TO | SRL_INFO_HASH
    \'string'                   SRL_INFO_REF_TO | SRL_INFO_SCALAR
    \\'string'                  SRL_INFO_REF_TO | SRL_INFO_REF
    \[]                         SRL_INFO_REF_TO | SRL_INFO_REF
    bless [], "Foo"             SRL_INFO_REF_TO | SRL_INFO_ARRAY | SRL_INFO_BLESSED

2) Length. If the type has the SRL_INFO_REF_TO bit set, the second element of the array is the length of the underlying array or hash. If the underlying object is itself a reference, the value is 1.

Examples:

    Serialized object           Length
    ____________________________________________________________
    [] or {}                    0
    [ 1, 2, 3 ]                 3
    { foo => 'bar' }            1
    bless [ 'test' ], "Foo"     1

Note that, due to the way Sereal encodes hashes, the actual number of elements inside a hash is twice the length value. This is because each key/value pair in the hash is represented as two objects: one for the key (always at an even index) and one for the value (always at an odd index).

3) Class name. If the type has the SRL_INFO_BLESSED bit set, then the third element of the returned array contains the class name of the blessed object.

Examples:

    Serialized object           Class name
    ____________________________________________________________
    bless [ 'test' ], "Foo"     Foo
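
To make the three return values concrete, here is a minimal sketch that inspects a serialized array and tests individual bits of the type. It assumes the SRL_INFO_* constants can be imported by name from Sereal::Path::Iterator (the text above only says they are exported); the $is_* variable names are illustrative.

```perl
use strict;
use warnings;
use Sereal::Encoder qw/encode_sereal/;
# Assumption: the SRL_INFO_* constants can be imported by name.
use Sereal::Path::Iterator qw/SRL_INFO_REF_TO SRL_INFO_ARRAY SRL_INFO_HASH SRL_INFO_BLESSED/;

my $spi = Sereal::Path::Iterator->new(encode_sereal([ 1, 2, 3 ]));

# info returns the type first, then the length (for references)
# and the class name (for blessed objects only).
my ($type, $length, $class) = $spi->info();

# The type is a bitset, so test each bit with a bitwise AND.
my $is_ref     = ($type & SRL_INFO_REF_TO())  ? 1 : 0;
my $is_array   = ($type & SRL_INFO_ARRAY())   ? 1 : 0;
my $is_hash    = ($type & SRL_INFO_HASH())    ? 1 : 0;
my $is_blessed = ($type & SRL_INFO_BLESSED()) ? 1 : 0;

print "ref=$is_ref array=$is_array hash=$is_hash blessed=$is_blessed length=$length\n";
```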

step_in

step_in handles two cases:

1) A reference object is being parsed (SRL_INFO_REF_TO). In this case the function steps inside the object and lets the caller inspect its content. You can think of step_in as dereferencing the reference. The function increments the iterator's stack depth by one.
2) In all other cases the function acts similarly to next.

step_in takes a single optional argument, the number of steps to take. The default value is 1.

step_out

This function is the opposite of step_in. Assuming that the current stack depth is N, the function moves the current position forward until the next element at level N-1 is reached. Consider the following:

    [
        { foo => 'bar' },
        100,
    ]

If the current position of the iterator is at any of the values inside the hashref, a call to step_out moves the iterator to the value 100.

Like step_in, step_out takes a single optional argument, the number of steps to take. The default value is 1.
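
The scenario above can be sketched as follows. This is an illustrative walk-through under the behaviour described in this section, not verified output: the iterator descends to the value inside the hashref and then steps out, after which decode should yield 100.

```perl
use strict;
use warnings;
use Sereal::Encoder qw/encode_sereal/;
use Sereal::Path::Iterator;

my $spi = Sereal::Path::Iterator->new(encode_sereal([ { foo => 'bar' }, 100 ]));

$spi->step_in();    # step inside the array, now at the hashref
$spi->step_in();    # step inside the hash, now at the key 'foo'
$spi->next();       # move from the key to its value 'bar'
$spi->step_out();   # leave the hash and move on to the next array element

my $value = $spi->decode();
print "value after step_out: $value\n";
```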

next

next steps over the current element without inspecting its content. Its main use case is skipping complex data structures. For instance, given

    [
        [ array of 1K elements ],
        { bar => 'foo' },
    ]

a single step_in followed by a call to next leaves the iterator at the hashref (but not inside it).

The function is guaranteed to remain at the same stack depth.

next also takes a single optional argument, the number of steps to take. The default value is 1.
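
A minimal sketch of that sequence, assuming a concrete 1000-element array in place of the placeholder above; the large array is skipped and only the hashref is decoded.

```perl
use strict;
use warnings;
use Sereal::Encoder qw/encode_sereal/;
use Sereal::Path::Iterator;

my $doc = encode_sereal([ [ 1 .. 1000 ], { bar => 'foo' } ]);
my $spi = Sereal::Path::Iterator->new($doc);

$spi->step_in();          # step inside the outer array, now at the inner array
$spi->next();             # step over the inner array, now at the hashref
my $r = $spi->decode();   # decode only the hashref

print "bar => $r->{bar}\n";
```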

rewind

This function rewinds the current position to the first element at the current depth. For example:

    [ 1, 2, 3, 4, 5, 6, 7, 8, 9, 10 ]

If the current position is at element 5 of the array, a call to rewind brings it back to 1.

rewind takes a single optional argument. If a positive integer is passed, rewind also acts similarly to step_out, except that it moves the current position backward rather than forward.

array_goto

array_goto takes a single argument, the index of the element to go to at the current depth. If such an element exists, the function moves the current position to the given index. array_goto rewinds the stack if necessary. The function croaks if the given index is outside the array boundaries.
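
For example, decoding only the third element of a 10-element array might look like the sketch below. It assumes indexes are zero-based, which this document does not state explicitly, so treat the index value as an assumption.

```perl
use strict;
use warnings;
use Sereal::Encoder qw/encode_sereal/;
use Sereal::Path::Iterator;

# Array of 10, 20, ..., 100
my $spi = Sereal::Path::Iterator->new(encode_sereal([ map { $_ * 10 } 1 .. 10 ]));

$spi->step_in();        # step inside the array so its elements are at the current depth
$spi->array_goto(2);    # assumption: zero-based index, so 2 is the third element
my $third = $spi->decode();

print "third element: $third\n";
```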

array_exists

array_exists returns a non-negative value if the given index exists and -1 otherwise.

hash_exists

hash_exists returns a non-negative value if the given hash key exists and -1 otherwise. If the key exists, the function stops at the key's value.

Note that hash_exists searches by linearly scanning the entire hash until either the key is found or the end of the hash is reached. It is an O(n) operation.
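
A short sketch of decoding a single hash value this way follows. It assumes the iterator must first step inside the hash so that its keys are at the current depth; the sample data is made up for illustration.

```perl
use strict;
use warnings;
use Sereal::Encoder qw/encode_sereal/;
use Sereal::Path::Iterator;

my $doc = encode_sereal({ foo => 'far', bar => [ 1 .. 1000 ] });
my $spi = Sereal::Path::Iterator->new($doc);

$spi->step_in();    # step inside the hash

my $value;
if ($spi->hash_exists('foo') >= 0) {
    # The iterator now sits at the value of 'foo', so decode only that value.
    $value = $spi->decode();
}

print defined $value ? "foo => $value\n" : "key not found\n";
```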

hash_key

hash_key assumes that the iterator's current position is a hash key. If so, the function deserializes and returns the key.

stack_depth

stack_depth returns the current stack depth.

stack_index

stack_index returns the index of the current element at the current depth.

stack_length

stack_length returns the total number of elements at the current depth. Please note that the value returned by this function does not always match the length returned by info. In particular, the length for hashes is twice as large.

decode

decode decodes the object at the current position. See also "KNOWN ISSUES".

decode_and_next

decode_and_next is an experimental method that combines decode and next in one call. Internal optimizations let this method avoid the double parsing that happens when decode is followed by next.
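
As a hedged sketch of how this might be used to walk consecutive array elements: the first two elements are read with decode_and_next and the last one with plain decode, so the iterator is never asked to move past the end of the array. Since the method is experimental, its behaviour may differ between versions.

```perl
use strict;
use warnings;
use Sereal::Encoder qw/encode_sereal/;
use Sereal::Path::Iterator;

my $spi = Sereal::Path::Iterator->new(encode_sereal([ 'a', 'b', 'c' ]));
$spi->step_in();    # step inside the array, now at 'a'

my @items;
push @items, $spi->decode_and_next();   # decode 'a' and move to 'b'
push @items, $spi->decode_and_next();   # decode 'b' and move to 'c'
push @items, $spi->decode();            # decode 'c' without moving further

print join(',', @items), "\n";
```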

KNOWN ISSUES

decode does not bless decoded values
decode does not weaken decoded values
decode does not alias decoded values

AUTHOR

Ivan Kruglov <ivan.kruglov@yahoo.com>

CONTRIBUTORS

Roman Studenikin <roman.studenikin@booking.com>

Steven Lee <stevenwh.lee@gmail.com>

Gonzalo Diethem <gonzalo.diethelm@gmail.com>

COPYRIGHT AND LICENCE

Copyright 2014-2017 Ivan Kruglov.

This module is tri-licensed. It is available under the X11 (a.k.a. MIT) licence; you can also redistribute it and/or modify it under the same terms as Perl itself.

a.k.a. "The MIT Licence"

Permission is hereby granted, free of charge, to any person obtaining a copy of this software and associated documentation files (the "Software"), to deal in the Software without restriction, including without limitation the rights to use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies of the Software, and to permit persons to whom the Software is furnished to do so, subject to the following conditions:

The above copyright notice and this permission notice shall be included in all copies or substantial portions of the Software.

THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE.