NAME

PDL::NiceSlice - toward a nicer slicing syntax for PDL

SYNOPSIS

use PDL::NiceSlice;

$x(1:4) .= 2;             # concise syntax for ranges
print $y((0),1:$end);     # use variables in the slice expression
$x->transpose->(($pos-1)) .= 0; # default method syntax

$idx = long 1, 7, 3, 0;   # an ndarray of indices
$x(-3:2:2,$idx) += 3;     # mix explicit indexing and ranges
$x->clump(1,2)->(0:30);   # 'default method' syntax
$x(myfunc(0,$var),1:4)++; # when using functions in slice expressions
                          # use parentheses around args!

$y = $x(*3);              # Add dummy dimension of order 3

# modifiers are specified in a ;-separated trailing block
$x($x!=3;?)++;            # short for $x->where($x!=3)++
$x(0:1114;_) .= 0;        # short for $x->flat->(0:1114)
$y = $x(0:-1:3;|);        # short for $x(0:-1:3)->sever
$n = sequence 3,1,4,1;
$y = $n(;-);              # drop all dimensions of size 1 (AKA squeeze)
$y = $n(0,0;-|);          # squeeze *and* sever
$c = $x(0,3,0;-);         # more compact way of saying $x((0),(3),(0))

DESCRIPTION

Slicing is a basic, extremely common operation, and PDL's slice method (see "slice" in PDL::Slices) is cumbersome to use in many cases. PDL::NiceSlice rectifies that by incorporating new slicing syntax directly into the language via a perl source filter (see perlfilter). NiceSlice adds no new functionality, only convenient syntax.

NiceSlice is loaded automatically in the perldl or pdl2 shell, but (to avoid conflicts with other modules) must be loaded explicitly in standalone perl/PDL scripts (see below). If you prefer not to use a prefilter on your standalone scripts, you can use the slice method (see "slice" in PDL::Slices) in those scripts, rather than the more compact NiceSlice constructs.

Use in scripts and perldl or pdl2 shell

The new slicing syntax can be switched on and off in scripts and perl modules by using or unloading PDL::NiceSlice.

Everything after use PDL::NiceSlice will be translated, and you can use the new slicing syntax. Source filtering will continue until the end of the file is encountered. You can stop source filtering before the end of the file by issuing a no PDL::NiceSlice statement.

Here is an example:

use PDL::NiceSlice;

# this code will be translated
# and you can use the new slicing syntax

no PDL::NiceSlice;

# this code won't
# and the new slicing syntax will raise errors!

See also Filter::Simple and example in this distribution for further examples.

NOTE: Unlike "normal" modules, you need to include a use PDL::NiceSlice call in each and every file that contains code that uses the new slicing syntax. Imagine the following situation: a file test0.pl

# start test0.pl
use PDL;
use PDL::NiceSlice;

$x = sequence 10;
print $x(0:4),"\n";

require 'test1.pl';
# end test0.pl

that requires a second file test1.pl

# begin test1.pl
$aa = sequence 11;
print $aa(0:7),"\n";
1;
# end test1.pl

Following conventional perl wisdom everything should be alright since we used PDL and PDL::NiceSlice already from within test0.pl and by the time test1.pl is required things should be defined and imported, etc. A quick test run will, however, produce something like the following:

 perl test0.pl
[0 1 2 3 4]
syntax error at test1.pl line 3, near "0:"
Compilation failed in require at test0.pl line 7.

This can be fixed by adding the line

use PDL::NiceSlice;

before the code in test1.pl that uses the new slicing syntax (to play safe just include the line near the top of the file), e.g.

# begin corrected test1.pl
use PDL::NiceSlice;
$aa = sequence 11;
print $aa(0:7),"\n";
1;
# end test1.pl

Now things proceed more smoothly

 perl test0.pl
[0 1 2 3 4]
[0 1 2 3 4 5 6 7]

Note that we don't need to issue use PDL again. PDL::NiceSlice is a somewhat funny module in that respect. It is a consequence of the way source filtering works in Perl (see also the IMPLEMENTATION section below).

evals and PDL::NiceSlice

Due to PDL::NiceSlice being a source filter it won't work in the usual way within evals. The following will not do what you want:

$x = sequence 10;
eval << 'EOE';

use PDL::NiceSlice;
$y = $x(0:5);

EOE
print $y;

Instead say:

use PDL::NiceSlice;
$x = sequence 10;
eval << 'EOE';

$y = $x(0:5);

EOE
print $y;

Source filters must be executed at compile time to be effective. And PDL::NiceSlice is just a source filter (although this is not necessarily obvious to the casual user).

The new slicing syntax

Using PDL::NiceSlice slicing ndarrays becomes so much easier since, first of all, you don't need to make explicit method calls. No

$pdl->slice(....);

calls, etc. Instead, PDL::NiceSlice introduces two ways in which to slice ndarrays without too much typing:

  • using parentheses directly following a scalar variable name, for example

    $c = $y(0:-3:4,(0));
  • using the so called default method invocation in which the ndarray object is treated as if it were a reference to a subroutine (see also perlref). Take this example that slices an ndarray that is part of a perl list @b:

    $c = $b[0]->(0:-3:4,(0));

The format of the argument list is the same for both types of invocation and will be explained in more detail below.

Parentheses following a scalar variable name

An arglist in parentheses following directly after a scalar variable name that is not preceded by & will be resolved as a slicing command, e.g.

$x(1:4) .= 2;         # only use this syntax on ndarrays
$sum += $x(,(1));

However, if the variable name is immediately preceded by a &, for example

&$x(4,5);

it will not be interpreted as a slicing expression. Rather, to avoid interfering with the current subref syntax, it will be treated as an invocation of the code reference $x with argument list (4,5).

The $x(ARGS) syntax collides in a minor way with the perl syntax. In particular, ``foreach $var(LIST)'' looks like a PDL slicing call. NiceSlice avoids translating the ``for $var(LIST)'' and ``foreach $var(LIST)'' constructs for this reason. Since you can't use just any old lvalue expression in the 'foreach' and 'for' constructs -- only a real perl scalar will do -- there's no functionality lost. If later versions of perl accept ``foreach <lvalue-expr> (LIST)'', then you can use the code ref syntax (the default method syntax described below) to get what you want.
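A minimal sketch of this exception, assuming PDL is installed; the variable names are illustrative only. The loop header is left alone, while the slice inside the loop body is still translated:

```perl
use strict;
use warnings;
use PDL;
use PDL::NiceSlice;

my $x = sequence(5);     # holds 0 1 2 3 4
my $total = 0;
my $i;

# The loop header looks like a slice but is left untranslated.
foreach $i(0, 2, 4) {
    # Inside the body the slice syntax works as usual;
    # sclr turns the 0D result into a plain perl number.
    $total += $x(($i))->sclr;
}

print "$total\n";        # 0 + 2 + 4 = 6
```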

The default method syntax

The second syntax that will be recognized is what I called the default method syntax. It is the method arrow -> directly followed by an open parenthesis, e.g.

$x->transpose->(($pos)) .= 0;

Note that this conflicts with the use of normal code references, since you can write in plain Perl

$sub = sub { print join ',', @_ };
$sub->(1,'a');

NOTE: Once use PDL::NiceSlice is in effect (you can always switch it off with a line no PDL::NiceSlice; anywhere in the script) the source filter will incorrectly replace the above call to $sub with an invocation of the slicing method. This is one of the pitfalls of using a source filter that doesn't know anything about the runtime type of a variable (cf. the Implementation section).

This shouldn't be a major problem in practice; a simple workaround is to use the &-way of calling subrefs, e.g.:

$sub = sub { print join ',', @_ };
&$sub(1,'a');

When to use which syntax?

Why are there two different ways to invoke slicing? The first syntax $x(args) doesn't work with chained method calls. E.g.

$x->xchg(0,1)(0);

won't work. It can only be used directly following a valid perl variable name. Instead, use the default method syntax in such cases:

$x->transpose->(0);

Similarly, if you have a list of ndarrays @pdls:

$y = $pdls[5]->(0:-1);

The argument list

The argument list is a comma-separated list. Each argument specifies how the corresponding dimension in the ndarray is sliced. In contrast to the slice method (see "slice" in PDL::Slices), the arguments should not be quoted. Rather, freely mix literals (1, 3, etc.), perl variables and function invocations, e.g.

$x($pos-1:$end,myfunc(1,3)) .= 5;

There can even be other slicing commands in the arglist:

$x(0:-1:$pdl($step)) *= 2;
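For comparison, here is a small sketch (assuming PDL is installed; $pos and $end are illustrative values) that builds the same slice once with unquoted NiceSlice arguments and once with a hand-assembled string for the slice method:

```perl
use strict;
use warnings;
use PDL;
use PDL::NiceSlice;

my $x = sequence(10);
my ($pos, $end) = (3, 6);

# NiceSlice: expressions are written unquoted inside the parentheses.
my $nice = $x($pos-1:$end);

# Plain slice method: the range has to be assembled as a string.
my $plain = $x->slice(($pos - 1) . ":$end");

print "$nice\n";    # [2 3 4 5 6]
print "$plain\n";   # [2 3 4 5 6]
```

Both forms select elements 2 through 6; the NiceSlice version simply avoids the string building.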

NOTE: If you use function calls in the arglist make sure that you use parentheses around their argument lists. Otherwise the source filter will get confused since it splits the argument list on commas that are not protected by parentheses. Take the following example:

 sub myfunc { return 5*$_[0]+$_[1] }
 $x = sequence 10;
 $sl = $x(0:myfunc 1, 2);
 print $sl;
PDL barfed: Error in slice:Too many dims in slice
Caught at file /usr/local/bin/perldl, line 232, pkg main

The simple fix is

 $sl = $x(0:myfunc(1, 2));
 print $sl;
[0 1 2 3 4 5 6 7]

Note that using prototypes in the definition of myfunc does not help. At this stage the source filter is simply not intelligent enough to make use of this information. So beware of this subtlety.

Another pitfall to be aware of: currently, you can't use the conditional operator (?:) in slice expressions, since the parser confuses its colon with a range separator. For example, the following will cause an error:

 $x = sequence 10;
 $y = rand > 0.5 ? 0 : 1; # this one is ok
 print $x($y ? 1 : 2);    # error !
syntax error at (eval 59) line 3, near "1,

For the moment, just try to stay clear of the conditional operator in slice expressions (or provide us with a patch to the parser to resolve this issue ;).

Modifiers

Following a suggestion originally put forward by Karl Glazebrook the latest versions of PDL::NiceSlice implement modifiers in slice expressions. Modifiers are convenient shorthands for common variations on PDL slicing. The general syntax is

$pdl(<slice>;<modifier>)

Four modifiers are currently implemented:

  • _ : flatten the ndarray before applying the slice expression. Here is an example

      $y = sequence 3, 3;
      print $y(0:-2;_); # same as $y->flat->(0:-2)
    [0 1 2 3 4 5 6 7]

    which is quite different from the same slice expression without the modifier

      print $y(0:-2);
    [
     [0 1]
     [3 4]
     [6 7]
    ]
  • | : sever the link to the ndarray, e.g.

      $x = sequence 10;
      $y = $x(0:2;|)++;  # same as $x(0:2)->sever++
      print $y;
    [1 2 3]
      print $x; # check if $x has been modified
    [0 1 2 3 4 5 6 7 8 9]
  • ? : shorthand to indicate that this is really a where expression

    As expressions like

    $x->where($x>5)

    are used very often you can write that shorter as

    $x($x>5;?)

    With the ?-modifier the expression preceding the modifier is not really a slice expression (e.g. ranges are not allowed) but rather an expression as required by the where method. For example, the following code will raise an error:

     $x = sequence 10;
     print $x(0:3;?);
    syntax error at (eval 70) line 3, near "0:"

    That's about all there is to know about this one.

  • - : squeeze out any singleton dimensions. In less technical terms: reduce the number of dimensions (potentially) by deleting all dims of size 1. It is equivalent to doing a reshape(-1). That can be very handy if you want to simplify the results of slicing operations:

     $x = ones 3, 4, 5;
     $y = $x(1,0;-); # easier to type than $x((1),(0))
     print $y->info;
    PDL: Double D [5]

    It also provides a unique opportunity to have smileys in your code! Yes, PDL gives new meaning to smileys.

Combining modifiers

Several modifiers can be used in the same expression, e.g.

$c = $x(0;-|); # squeeze and sever

Other combinations are just as useful, e.g. ;_| to flatten and sever. The sequence in which modifiers are specified is not important.

A notable exception is the where modifier (?) which must not be combined with other flags (let me know if you see a good reason to relax this rule).
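As an illustration of the ;_| combination mentioned above, the following sketch (assuming PDL is installed; variable names are illustrative) contrasts a flattened and severed slice with a flattened slice that stays connected to its parent:

```perl
use strict;
use warnings;
use PDL;
use PDL::NiceSlice;

my $x = sequence(3, 3);       # elements 0..8, sum is 36

# Flatten, take the first four elements, and sever the result.
my $copy = $x(0:3;_|);
$copy .= -1;                  # changes the severed copy only
print $x->sum, "\n";          # still 36

# Without the sever modifier the result stays connected to the parent.
my $view = $x(0:3;_);
$view .= -1;                  # flows back into the parent ndarray
print $x->sum, "\n";          # 30 - 4 = 26
```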

Repeating any modifier will raise an error:

 $c = $x(-1:1;|-|); # will cause error
NiceSlice error: modifier | used twice or more

Modifiers are still a new and experimental feature of PDL::NiceSlice. I am not sure how many of you are actively using them. Please do so and experiment with the syntax. I think modifiers are very useful and make life a lot easier. Feedback is welcome as usual. The modifier syntax will likely be further tuned in the future but we will attempt to ensure backwards compatibility whenever possible.

Argument formats

In slice expressions you can use ranges, dummy dimensions, and ndarrays as 1D index lists (although compare the description of the ?-modifier above for an exception).

  • ranges

    You can access ranges using the usual : separated format:

    $x($start:$stop:$step) *= 4;

    Note that you can omit the trailing step which then defaults to 1. Double colons (::) are not allowed to avoid clashes with Perl's namespace syntax. So if you want to use steps different from the default you have to also at least specify the stop position. Examples:

    $x(::2);   # this won't work (in the way you probably intended)
    $x(:-1:2); # this will select every 2nd element in the 1st dim

    Just as with "slice" in PDL::Slices negative indices count from the end of the dimension backwards with -1 being the last element. If the start index is larger than the stop index the resulting ndarray will have the elements in reverse order between these limits:

     print $x(-2:0:2);
    [8 6 4 2 0]

    A single index just selects the given index in the slice

     print $x(5);
    [5]

    Note, however, that the corresponding dimension is not removed from the resulting ndarray but rather reduced to size 1:

     print $x(5)->info;
    PDL: Double D [1]

    If you want to get completely rid of that dimension enclose the index in parentheses (again similar to the "slice" in PDL::Slices syntax):

     print $x((5));
    5

    In this particular example a 0D ndarray results. Note that this syntax is only allowed with a single index. Neither of the following does what you might intend:

    print $x((0,4));  # will work but not in the intended way
    print $x((0:4));  # compile time error

    An empty argument selects the whole dimension, in this example all of the first dimension:

    print $x(,(0));

    Alternative ways to select a whole dimension are

    $x = sequence 5, 5;
    print $x(:,(0));
    print $x(0:-1,(0));
    print $x(:-1,(0));
    print $x(0:,(0));

    Arguments for trailing dimensions can be omitted. In that case these dimensions will be fully kept in the sliced ndarray:

     $x = random 3,4,5;
     print $x->info;
    PDL: Double D [3,4,5]
     print $x((0))->info;
    PDL: Double D [4,5]
     print $x((0),:,:)->info;  # a more explicit way
    PDL: Double D [4,5]
     print $x((0),,)->info;    # similar
    PDL: Double D [4,5]
  • dummy dimensions

    As in "slice" in PDL::Slices, you can insert a dummy dimension by preceding a single index argument with '*'. A lone '*' inserts a dummy dimension of order (i.e. size) 1; a '*' followed by a number inserts a dummy dimension of that order.

  • ndarray index lists

    The second way to select indices from a dimension is via 1D ndarrays of indices. A simple example:

    $x = random 10;
    $idx = long 3,4,7,0;
    $y = $x($idx);

    This way of selecting indices was previously only possible using "dice" in PDL::Slices (PDL::NiceSlice attempts to unify the slice and dice interfaces). Note that the indexing ndarrays must be 1D or 0D. Higher dimensional ndarrays as indices will raise an error:

     $x = sequence 5, 5;
     $idx2 = ones 2,2;
     $sum = $x($idx2)->sum;
    ndarray must be <= 1D at /home/XXXX/.perldlrc line 93

    Note that using index ndarrays is not as efficient as using ranges. If you can represent the indices you want to select using a range, use that rather than an equivalent index ndarray. In particular, memory requirements are increased with index ndarrays (and execution time may be longer). That said, if an index ndarray is the way to go, use it!
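Returning to the dummy dimensions item in the list above, which has no example of its own: a minimal sketch, assuming PDL is installed, that shows where the dummy dimension ends up depending on its position in the argument list (variable names are illustrative):

```perl
use strict;
use warnings;
use PDL;
use PDL::NiceSlice;

my $x = sequence(4);                 # dims: 4

my $front = $x(*3);                  # dummy dim of size 3 inserted first
print join(',', $front->dims), "\n"; # 3,4

my $back = $x(:,*3);                 # keep dim 0, then add the dummy dim
print join(',', $back->dims), "\n";  # 4,3
```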

As you might have expected ranges and index ndarrays can be freely mixed in slicing expressions:

$x = random 5, 5;
$y = $x(-1:2,pdl(3,0,1));

ndarrays as indices in ranges

You can use ndarrays to specify indices in ranges. With the new slicing syntax there is no need to turn them into proper perl scalars. However, make sure they contain no more than one element! Otherwise a runtime error will be triggered. First a couple of examples that illustrate proper usage:

 $x = sequence 5, 5;
 $rg = pdl(1,-1,3);
 print $x($rg(0):$rg(1):$rg(2),2);
[
 [11 14]
]
 print $x($rg+1,:$rg(0));
[
 [2 0 4]
 [7 5 9]
]

The next one raises an error

 print $x($rg+1,:$rg(0:1));
multielement ndarray where only one allowed at XXX/Core.pm line 1170.

The problem is caused by using the 2-element ndarray $rg(0:1) as the stop index in the second argument :$rg(0:1) that is interpreted as a range by PDL::NiceSlice. You can use multielement ndarrays as index ndarrays as described above but not in ranges. And PDL::NiceSlice treats any expression with unprotected :'s as a range. Unprotected means as usual "not occurring between matched parentheses".

IMPLEMENTATION

PDL::NiceSlice exploits the ability of Perl to use source filtering (see also perlfilter). A source filter basically filters (or rewrites) your perl code before it is seen by the compiler. PDL::NiceSlice searches through your Perl source code and when it finds the new slicing syntax it rewrites the argument list appropriately and splices a call to the slice method using the modified arg list into your perl code. You can see how this works in the perldl or pdl2 shells by switching on reporting.

BUGS

Conditional operator

The conditional operator can't be used in slice expressions (see above).

The DATA file handle

Note: To avoid clobbering the DATA filehandle PDL::NiceSlice switches itself off when encountering the __END__ or __DATA__ tokens. This should not be a problem for you unless you use SelfLoader to load PDL code including the new slicing from that section. It is even desirable when working with Inline::Pdlpp, see below.

Possible interaction with Inline::Pdlpp

There is currently an undesired interaction between PDL::NiceSlice and the new Inline::Pdlpp module (currently only in PDL CVS). Since PP code generally contains expressions of the type $var() (to access ndarrays, etc) PDL::NiceSlice recognizes those incorrectly as slice expressions and does its substitutions. This is not a problem if you use the DATA section for your Pdlpp code -- the recommended place for Inline code anyway. In that case PDL::NiceSlice will have switched itself off before encountering any Pdlpp code (see above):

# use with Inline modules
use PDL;
use PDL::NiceSlice;
use Inline Pdlpp;

$x = sequence(10);
print $x(0:5);

__END__

__Pdlpp__

... inline stuff

Otherwise switch PDL::NiceSlice explicitly off around the Inline::Pdlpp code:

use PDL::NiceSlice;

$x = sequence 10;
$x(0:3)++;
$x->inc;

no PDL::NiceSlice; # switch off before Pdlpp code
use Inline Pdlpp => "Pdlpp source code";

The cleaner solution is to always stick with the DATA way of including your Inline code as in the first example. That way you keep your nice Perl code at the top and all the ugly Pdlpp stuff etc at the bottom.

Bug reports

Feedback and bug reports are welcome. Please include an example that demonstrates the problem. Log bug reports in the PDL issues tracker at https://github.com/PDLPorters/pdl/issues or send them to the pdl-devel mailing list (see http://pdl.perl.org/?page=mailing-lists).

COPYRIGHT

Copyright (c) 2001, 2002 Christian Soeller. All Rights Reserved. This module is free software. It may be used, redistributed and/or modified under the same terms as PDL itself (see http://pdl.perl.org).