Ingy döt Net


String::Slice - Shared Memory Slices of Bigger Strings



This document describes String::Slice version 0.08. ";


    # Exports 'slice':
    use String::Slice;

    # String::Slice is for parsing across huge strings:
    my $buffer = get_large_buffer;

    # Make an SV to turn into a slice:
    my $slice = '';

    # Move (parse) forward across the $buffer in small increments:
    for (my $pos = 0; $pos <= length $buffer;) {

      # $pos is the current postion in buffer.
      # $slice retains info from previous call:
      slice($slice, $buffer, $pos);

      # front-anchored means /\A…/
      my $regex = get_small_front_anchored_regex;

      # Match may fail. Stay at same $pos if fail:
      $slice =~ $regex or next;

      # Add length of match:
      $pos += $+[0];


Processing large strings in Perl is inefficient because to access any smaller portion of a buffer you need to make a copy of that portion. Also finding substr offsets in large utf8 strings requires looping, since each character can have a varying length.

This module lets you make a string scalar (a "slice") point to a portion of the content of another string scalar. Finding the next slice is based on the position of the previous slice, so hopping over utf8 is much faster.

The primary goal of this module is to make the parsing large data much faster in Perl.


String::Slice exports one function: slice. It can be called in a few different ways:

    slice($slice_variable, $big_buffer_variable, $char_offset, $char_length)

This effectively makes $slice_variable a substr of the buffer, providing a faster, and more memory efficient way of doing:

    $slice_variable = substr($big_buffer_variable, $char_offset, $char_length);

The offset defaults to 0, and if no length is given, the slice goes to the end of the buffer.

If $slice is already a slice of $buffer then this call:

    slice($slice, $buffer, $offset, $length)

will subtract the previous offset (stashed in the slice internally) from the current offset and hop the difference. (This may be forwards or backwards).

If length is too long, the slice will go to the end of the buffer.

One side effect of this function is that both strings will become readonly, and the memory will not be freed until they both go out of scope.

The slice function returns 1 on success and 0 on failure. Failure occurs if the requested offset is invalid (less than the start or greater than or equal to the end of the buffer).


These people provided invaluable help:

  • Jan Dubois

  • Florian Ragwitz


Ingy döt Net <>


Copyright 2012-2015. Ingy döt Net.

This program is free software; you can redistribute it and/or modify it under the same terms as Perl itself.