
NAME

String::Base - string index offsetting

SYNOPSIS

    use String::Base +1;

    no String::Base;

DESCRIPTION

This module implements automatic offsetting of string indices. In normal Perl, the first character of a string has index 0, the second character has index 1, and so on. This module allows string indices to start at some other value. Most commonly it is used to give the first character of a string the index 1 (and the second 2, and so on), to imitate the indexing behaviour of FORTRAN and many other languages. It is usually considered poor style to do this.

The string index offset is controlled at compile time, in a lexically-scoped manner. Each block of code, therefore, is subject to a fixed offset. It is expected that the affected code is written with knowledge of what that offset is.

Using a string index offset

A string index offset is set up by a use String::Base directive, with the desired offset specified as an argument. Beware that a bare, unsigned number in that argument position, such as "use String::Base 1", will be interpreted as a version number to require of String::Base. It is therefore necessary to give the offset a leading sign, or parenthesise it, or otherwise decorate it. The offset may be any integer (positive, zero, or negative) within the range of Perl's integer arithmetic.
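As a minimal sketch of the point above, the following shows two ways of writing the offset so that it is not taken for a version number. The variable names are illustrative only, and the values in the comments assume the behaviour described in this document.

```perl
use strict;
use warnings;

my $s = "abcdef";
my ($signed, $parenthesised);

{
    use String::Base +1;                 # leading sign: offset of 1
    $signed = substr($s, 1, 1);          # first character, "a"
}
{
    use String::Base (1);                # parenthesised: also an offset of 1
    $parenthesised = substr($s, 1, 1);   # first character, "a"
}

# By contrast, "use String::Base 1;" would be read as a request for
# at least version 1 of the module, not as an offset of 1.
```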

A string index offset declaration is in effect from immediately after the use line, until the end of the enclosing block or until overridden by another string index offset declaration. A declared offset always replaces the previous offset: they do not add. "no String::Base" is equivalent to "use String::Base +0": it returns to the Perlish state with zero offset.
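The scoping rules above can be sketched as follows. This is an illustration only, with a hypothetical @got array collecting the results; it assumes nothing beyond what the paragraph above states.

```perl
use strict;
use warnings;

my $s = "abcdef";
my @got;

{
    use String::Base +1;
    push @got, substr($s, 2, 1);        # offset 1: second character, "b"
    {
        use String::Base +10;           # replaces +1, does not add to it
        push @got, substr($s, 12, 1);   # offset 10: index 12 is the third character, "c"
    }
    push @got, substr($s, 2, 1);        # inner block has ended, offset 1 again: "b"
    no String::Base;
    push @got, substr($s, 2, 1);        # zero offset from here on: "c"
}
push @got, substr($s, 2, 1);            # outside the block, ordinary Perl: "c"
```

Because the offset is fixed per block at compile time, each substr call above can be read in isolation once the nearest enclosing declaration is known.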

A declared string index offset influences these types of operation:

  • substring extraction (substr($a, 3, 2))

  • substring splicing (substr $a, 3, 2, "x")

  • substring searching (index($a, "x"), index($a, "x", 3), rindex($a, "x"), rindex($a, "x", 3))

  • string iterator position (pos($a))

Only forwards indexing, relative to the start of the string, is supported. End-relative indexing, normally done using negative index values, is not supported when an index offset is in effect. Use of an index that is numerically less than the index offset will have unpredictable results.
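The affected operations listed above might look like this with an offset of 1 in effect. This is a sketch with illustrative variable names; it uses only forward indices that are at least as large as the offset, and avoids calling the affected operators in void context.

```perl
use strict;
use warnings;
use String::Base +1;

my $s = "abcabc";                      # a=1 b=2 c=3 a=4 b=5 c=6

my $piece  = substr($s, 3, 2);         # third and fourth characters: "ca"
my $first  = index($s, "c");           # 3
my $second = index($s, "c", 4);        # search starting at index 4: 6
my $last   = rindex($s, "a");          # 4
my $before = rindex($s, "a", 3);       # rightmost "a" starting at or before index 3: 1

my $t = "abcdef";
my $removed = substr $t, 3, 2, "x";    # splices out "cd", leaving "abxef"

$s =~ /bc/g or die "no match";
my $where = pos($s);                   # iterator sits just after "bc": 4
```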

Differences from $[

This module is a replacement for the historical $[ variable. In early Perl that variable was a runtime global, affecting all array and string indexing in the program. In Perl 5, assignment to $[ acts as a lexically-scoped pragma. $[ is deprecated. The original $[ was removed in Perl 5.15.3, and later replaced in Perl 5.15.5 by an automatically-loaded arybase module. This module reimplements the index offset feature without any specific support from the core.

Unlike $[, this module does not affect indexing into arrays. This module is concerned only with strings. To influence array indexing, see Array::Base.

This module does not show the offset value in $[ or any other accessible variable. With the string offset being lexically scoped, there should be no need to write code to handle a variable offset.

$[ has some predictable, but somewhat strange, behaviour for indexes less than the offset. The behaviour differs between substring extraction and iterator positioning. This module does not attempt to replicate it, and does not support end-relative indexing at all.

The string iterator position operator (pos($a)), as implemented by the Perl core, generates a magical scalar which is linked to the underlying string. The numerical value of the scalar varies if the iterator position of the string is changed, and code with different $[ settings will see accordingly different values. The scalar can also be written to, to change the position of the string's iterator, and again the interpretation of the value written varies according to the $[ setting of the code that is doing the writing. This module does not replicate any of that behaviour. With a string index offset from this module in effect, pos($a) evaluates to an ordinary rvalue scalar, giving the position of the string's iterator as it was at the time the operator was evaluated, according to the string index offset in effect where the operator appears.
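One way to work with this, sketched here under the assumption that reading the position is all that is needed inside the offset region, is to read pos($a) where the offset is in effect and to do any writing of the iterator position in code with no offset declared, where the core operator behaves as usual. The variable names are illustrative.

```perl
use strict;
use warnings;

my $text = "abcdef";
$text =~ /bc/g or die "no match";    # iterator now sits just after "bc"

my $one_based;
{
    use String::Base +1;
    $one_based = pos($text);         # plain value, counted from 1: 4
}
my $zero_based = pos($text);         # core behaviour, counted from 0: 3

# No offset is in effect here, so pos() is the ordinary core operator
# and can be assigned to, using zero-based positions.
pos($text) = 1;
$text =~ /\G(.)/g or die "no match";
my $next = $1;                       # character at zero-based position 1: "b"
```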

PACKAGE METHODS

These methods are meant to be invoked on the String::Base package.

String::Base->import(BASE)

Sets up a string index offset of BASE, in the lexical environment that is currently compiling.

String::Base->unimport

Clears the string index offset, in the lexical environment that is currently compiling.
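These methods are normally reached through use and no. As a sketch of calling them directly, the following relies on the equivalence between use and a BEGIN block given in perlfunc; calling them from a BEGIN block ensures they run while the enclosing code is still being compiled. The variable names are illustrative.

```perl
use strict;
use warnings;

my $s = "abcdef";
my ($with, $without);

{
    # Equivalent to "use String::Base (1);"
    BEGIN { require String::Base; String::Base->import(1); }
    $with = substr($s, 2, 1);        # offset 1: second character, "b"

    # Equivalent to "no String::Base;"
    BEGIN { String::Base->unimport; }
    $without = substr($s, 2, 1);     # zero offset: third character, "c"
}
```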

BUGS

B::Deparse will generate incorrect source when deparsing code that uses a string index offset. It will include both the pragma to set up the offset and the munged form of the affected operators. Either the pragma or the munging is required to get the index offset effect; using both will double the offset. Also, the code generated for a string iterator position (pos($a)) operation involves a custom operator, which B::Deparse can't understand, so the source it emits in that case is completely wrong.

The additional operators generated by this module cause spurious warnings if some of the affected string operations are used in void context.

Prior to Perl 5.9.3, the lexical state of string index offset does not propagate into string eval.

SEE ALSO

Array::Base, arybase, "$[" in perlvar

AUTHOR

Andrew Main (Zefram) <zefram@fysh.org>

COPYRIGHT

Copyright (C) 2011, 2012, 2017 Andrew Main (Zefram) <zefram@fysh.org>

LICENSE

This module is free software; you can redistribute it and/or modify it under the same terms as Perl itself.