NAME

PDL::CCS::MatrixOps - Low-level matrix operations for compressed storage sparse PDLs

SYNOPSIS

 use PDL;
 use PDL::CCS::MatrixOps;

 ##---------------------------------------------------------------------
 ## ... stuff happens

FUNCTIONS

ccs_matmult2d_sdd

  Signature: (
    indx ixa(NdimsA,NnzA); nza(NnzA); missinga();
    b(O,M);
    zc(O);
    [o]c(O,N)
    )

Two-dimensional matrix multiplication of a sparse index-encoded PDL $a() with a dense pdl $b(), with output to a dense pdl $c().

The sparse input PDL $a() should be passed here with 0th dimension "M" and 1st dimension "N", just as for the built-in PDL::Primitive::matmult().

"Missing" values in $a() are treated as $missinga(), which should be neither BAD nor infinite; any other value ought to be handled correctly. The input pdl $zc() is used to pass the cached contribution of a $missinga()-row ("M") to an output column ("O"), i.e.

 $zc = ((zeroes($M,1)+$missinga) x $b)->flat;

$SIZE(NdimsA) is assumed to be 2.

ccs_matmult2d_sdd does not process bad values. It will set the bad-value flag of all output piddles if the flag is set for any of the input piddles.
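
The role of $zc() can be illustrated with a small reference sketch in plain Perl, without PDL. It is not the module's implementation: the sub name matmult2d_sdd_sketch, the nested-array layout and the sample data are assumptions made for illustration only. Every output row starts out as the cached contribution of an all-$missinga() row, and each stored entry then adds the difference between its real value and $missinga().

```perl
use strict;
use warnings;

# Hypothetical helper (not part of PDL::CCS): reference semantics of a
# sparse-times-dense product with a non-zero "missing" value.
#   $ixa     : [[m,n], ...]  index pairs of the stored entries of a(M,N)
#   $nza     : [v, ...]      stored values, parallel to $ixa
#   $missing : value assumed for every entry of a() not listed in $ixa
#   $dense_b : $dense_b->[$m][$o], i.e. b(O,M) as M rows of O values
#   $N       : number of rows of a() and of the result
# Returns $c->[$n][$o], i.e. c(O,N).
sub matmult2d_sdd_sketch {
    my ($ixa, $nza, $missing, $dense_b, $N) = @_;
    my $M = scalar @$dense_b;
    my $O = scalar @{ $dense_b->[0] };

    # zc(o): what a row consisting only of $missing contributes to column o
    my @zc = (0) x $O;
    for my $m (0 .. $M-1) {
        $zc[$_] += $missing * $dense_b->[$m][$_] for 0 .. $O-1;
    }

    # every output row starts out as a copy of zc ...
    my @c = map { [@zc] } 1 .. $N;

    # ... and each stored entry swaps $missing for its real value
    for my $k (0 .. $#$nza) {
        my ($m, $n) = @{ $ixa->[$k] };
        my $delta = $nza->[$k] - $missing;
        $c[$n][$_] += $delta * $dense_b->[$m][$_] for 0 .. $O-1;
    }
    return \@c;
}

# a(M=3,N=2) with a(0,0)=1, a(1,1)=2 and 0.5 everywhere else
my @b_sdd = ([1, 2], [3, 4], [5, 6]);     # b(O=2,M=3)
my $c_sdd = matmult2d_sdd_sketch([[0,0], [1,1]], [1, 2], 0.5, \@b_sdd, 2);
# $c_sdd is now [[5, 7], [9, 12]]
```

Caching $zc() in this way means the work per call grows with the number of stored entries rather than with the full size of $a().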

ccs_matmult2d_zdd

  Signature: (
    indx ixa(Ndimsa,NnzA); nza(NnzA);
    b(O,M);
    [o]c(O,N)
    )

Two-dimensional matrix multiplication of a sparse index-encoded PDL $a() with a dense pdl $b(), with output to a dense pdl $c().

The sparse input PDL $a() should be passed here with 0th dimension "M" and 1st dimension "N", just as for the built-in PDL::Primitive::matmult().

"Missing" values in $a() are treated as zero. $SIZE(Ndimsa) is assumed to be 2.

ccs_matmult2d_zdd does not process bad values. It will set the bad-value flag of all output piddles if the flag is set for any of the input piddles.
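
As a rough illustration of the index-encoded layout and of the zero-missing product, the following plain-Perl sketch builds index/value lists from a small dense matrix and multiplies them with a dense $b(). The helper names encode_nonzeros and matmult2d_zdd_sketch are hypothetical, as is the order in which entries are stored; nothing here is taken from the module's own code.

```perl
use strict;
use warnings;

# Hypothetical helpers, plain Perl only; none of these names exist in PDL::CCS.

# Collect the non-zero entries of a dense matrix given as $rows->[$n][$m].
# Returns ($ixa, $nza): index pairs [m,n] and the matching values.
sub encode_nonzeros {
    my ($rows) = @_;
    my (@ixa, @nza);
    for my $n (0 .. $#$rows) {
        for my $m (0 .. $#{ $rows->[$n] }) {
            next if $rows->[$n][$m] == 0;
            push @ixa, [$m, $n];
            push @nza, $rows->[$n][$m];
        }
    }
    return (\@ixa, \@nza);
}

# c(o,n) = sum over stored entries (m,n) of a(m,n) * b(o,m);
# entries of a() that are not stored count as zero.
sub matmult2d_zdd_sketch {
    my ($ixa, $nza, $dense_b, $N) = @_;
    my $O = scalar @{ $dense_b->[0] };
    my @c = map { [ (0) x $O ] } 1 .. $N;
    for my $k (0 .. $#$nza) {
        my ($m, $n) = @{ $ixa->[$k] };
        $c[$n][$_] += $nza->[$k] * $dense_b->[$m][$_] for 0 .. $O-1;
    }
    return \@c;
}

my @a_rows = ([1, 0, 2],     # a(M=3,N=2): N rows of M values
              [0, 3, 0]);
my @b_rows = ([1, 2],        # b(O=2,M=3): M rows of O values
              [3, 4],
              [5, 6]);
my ($ixa_zdd, $nza_zdd) = encode_nonzeros(\@a_rows);
my $c_zdd = matmult2d_zdd_sketch($ixa_zdd, $nza_zdd, \@b_rows, scalar @a_rows);
# $c_zdd is now [[11, 14], [9, 12]]
```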

ccs_vcos_zdd

  Signature: (
    indx ixa(Two,NnzA); nza(NnzA);
    b(N);
    float+ [o]vcos(M);
    float+ [t]anorm(M);; int sizeM=>M)

Computes the vector cosine similarity of a dense row-vector $b(N) with respect to each column $a(i,*) of a sparse index-encoded PDL $a() of logical dimensions (M,N), with output to a dense piddle $vcos(M). "Missing" values in $a() are treated as zero, and $SIZE(Two) must be 2. This is basically the same thing as:

 ($a * $b->slice("*1,"))->xchg(0,1)->sumover / ($a->pow(2)->xchg(0,1)->sumover->sqrt * $b->pow(2)->sumover->sqrt)

... but should be much faster to compute. Output values in $vcos() are cosine similarities in the range [-1,1], except for zero-magnitude vectors which will result in NaN values in $vcos(). If you need non-negative distances, follow this up with:

 $vcos->minus(1,$vcos,1);
 $vcos->inplace->setnantobad->inplace->setbadtoval(0); ##-- minimum distance for NaN values

to get distance values in the range [0,2]. You can use PDL threading to batch-compute distances for multiple $b() vectors simultaneously:

  $bx   = random($N, $NB);                   ##-- get $NB random vectors of size $N
  $vcos = ccs_vcos_zdd($ixa,$nza, $bx, $M);  ##-- $vcos is now ($M,$NB)

ccs_vcos_zdd() will set the bad status flag on the output piddle $vcos() if it is set on either of the input piddles $nza() or $b().
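
The cosine computation and the distance conversion described above can be sketched in plain Perl for a single $b() vector. This is an illustrative reading of the documented behaviour, not the module's code: the helper names vcos_zdd_sketch and vcos_to_distance are hypothetical, and undef is used here as a stand-in for the NaN values that ccs_vcos_zdd() produces for zero-magnitude vectors.

```perl
use strict;
use warnings;

# Hypothetical helpers, plain Perl only (not the PDL::CCS implementation).
#   $ixa : [[m,n], ...] index pairs of the stored entries of a(M,N)
#   $nza : stored values, parallel to $ixa
#   $vec : dense vector b(N)
#   $M   : size of the output, as with the sizeM parameter
# Returns an array ref of $M cosines; undef marks a zero-magnitude vector.
sub vcos_zdd_sketch {
    my ($ixa, $nza, $vec, $M) = @_;
    my @dot   = (0) x $M;   # sum_n a(m,n) * b(n)
    my @anorm = (0) x $M;   # sum_n a(m,n)**2
    for my $k (0 .. $#$nza) {
        my ($m, $n) = @{ $ixa->[$k] };
        $dot[$m]   += $nza->[$k] * $vec->[$n];
        $anorm[$m] += $nza->[$k] ** 2;
    }
    my $bnorm = 0;
    $bnorm += $_ ** 2 for @$vec;
    return [ map {
        my $denom = sqrt($anorm[$_]) * sqrt($bnorm);
        $denom > 0 ? $dot[$_] / $denom : undef;
    } 0 .. $M-1 ];
}

# Convert cosines to distances in [0,2]; undefined cosines get distance 0.
sub vcos_to_distance {
    my ($vcos) = @_;
    return [ map { defined($_) ? 1 - $_ : 0 } @$vcos ];
}

# a(M=3,N=2): a(0,0)=3, a(0,1)=4, a(1,0)=1; nothing stored for m=2
my $vcos_demo = vcos_zdd_sketch([[0,0], [0,1], [1,0]], [3, 4, 1], [3, 4], 3);
my $dist_demo = vcos_to_distance($vcos_demo);
# $vcos_demo holds 1, 0.6 and undef; $dist_demo holds 0, about 0.4 and 0
```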

ACKNOWLEDGEMENTS

Perl by Larry Wall.

PDL by Karl Glazebrook, Tuomas J. Lukka, Christian Soeller, and others.

KNOWN BUGS

We should really implement matrix multiplication in terms of inner product, and have a good sparse-matrix only implementation of the former.

AUTHOR

Bryan Jurish <moocow@cpan.org>

All other parts Copyright (C) 2009-2015, Bryan Jurish. All rights reserved.

This package is free software, and entirely without warranty. You may redistribute it and/or modify it under the same terms as Perl itself.

SEE ALSO

perl(1), PDL(3perl)