NAME

Bio::Gonzales::Matrix::IO - Library for simple matrix IO

SYNOPSIS

    use Bio::Gonzales::Matrix::IO qw(lspew mslurp lslurp mspew);

DESCRIPTION

Provides functions for common matrix/list IO.

SUBROUTINES

dict_slurp($filename, \%options)
  %options = (
    sep     => qr/\t/,
    header  => 0,
    skip    => -1,
    comment => qr/^\s*#/,
    key_idx => 0,
    val_idx => undef,
    uniq    => 0,
    record_filter => undef,
    concat_keys => 1,
    strict => 0
  );

Setups:

uniq = 1 && no val_idx => read in key_idx as hash and set values to 1
uniq = 0 && no val_idx => read in key_idx as hash and set each value to the number of times the key occurs
uniq = 1 && val_idx => read into ( key => [ @values ], ...)
uniq = 0 && val_idx => read into ( key => [ [ @values ], [ @more_values ] ], ...)
uniq = 1 && strict = 1 => confess if the same key occurs more than once in the data

concat_keys

Join the key columns with $; to form a single hash key. If set to 0, the key columns are read one after another and merged into one big column.

If key_idx is an array, the key columns are joined by $; to build the hash key.
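
The four setups are easiest to compare side by side. The following is a minimal sketch in core Perl that mimics them on in-memory rows; it is not the module's implementation. The helper build_dict and the sample rows are hypothetical, and two details are assumptions: val_idx is allowed to be an array ref, and with uniq = 1 a later row overwrites an earlier one.

```perl
use strict;
use warnings;

# Hypothetical helper, NOT the module's code: mimics the documented setups.
sub build_dict {
    my ($rows, $opt) = @_;
    my @key_idx = ref($opt->{key_idx}) ? @{ $opt->{key_idx} } : ($opt->{key_idx} // 0);
    my %dict;
    for my $row (@$rows) {
        # Several key columns are joined by $; into one hash key.
        my $key = join($;, @{$row}[@key_idx]);
        if (!defined $opt->{val_idx}) {
            if ($opt->{uniq}) { $dict{$key} = 1 } else { $dict{$key}++ }
            next;
        }
        my @val_idx = ref($opt->{val_idx}) ? @{ $opt->{val_idx} } : ($opt->{val_idx});
        my @values  = @{$row}[@val_idx];
        if ($opt->{uniq}) { $dict{$key} = [@values] }            # assumption: last row wins
        else              { push @{ $dict{$key} }, [@values] }   # collect every row per key
    }
    return \%dict;
}

my @rows = ( [qw(geneA chr1 10)], [qw(geneB chr2 20)], [qw(geneA chr1 30)] );

my $seen   = build_dict(\@rows, { uniq => 1 });                    # key => 1
my $counts = build_dict(\@rows, { uniq => 0 });                    # key => count
my $last   = build_dict(\@rows, { uniq => 1, val_idx => [2] });    # key => [ @values ]
my $all    = build_dict(\@rows, { uniq => 0, val_idx => [2] });    # key => [ [ @values ], ... ]
my $multi  = build_dict(\@rows, { uniq => 0, key_idx => [0, 1] }); # keys joined by $;
```

A key built from several columns can be taken apart again with split /$;/, $key.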

mspew($filename, \@matrix, \%options)
mspew($filehandle, \@matrix, \%options)

Save the values in @matrix to a $filename or $filehandle. @matrix is an array of arrayrefs:

    @matrix = (
        [ l11, l12, l13 ],
        [ l21, l22, l23 ],
        [ l31, l32, l33 ]
    );

Options:

header / ids

Supply a header. Same as

    mspew($file, [ \@header, @matrix ])

row_names

Supply row names as an array ref. If the value is true but not an array, the header is used as row names.

    mspew( $file, $matrix, { row_names => 1 } );                            # use header
    mspew( $file, $matrix, { row_names => [ 'row1', '...', 'rown' ] } );    # use supplied row names

fill_missing_cols

If a row has fewer columns than the longest row of the matrix, fill it up with empty strings.

na_value

Use this value in case undefined values are found. Default is 'NA'.

sep

Set the column separator for the output file.

square (default 1)

Add empty columns to fill up to a square.
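
To make the effect of na_value, fill_missing_cols and sep concrete, here is a rough sketch in core Perl that renders a small matrix to lines. It does not call mspew; the helper format_matrix is hypothetical, and the order of operations (undefined cells replaced first, short rows padded afterwards) and the tab default for sep are assumptions.

```perl
use strict;
use warnings;
use List::Util qw(max);

# Hypothetical helper, NOT the module's code: renders a matrix to text lines.
sub format_matrix {
    my ($matrix, $opt) = @_;
    $opt ||= {};
    my $sep   = $opt->{sep}      // "\t";   # assumed default
    my $na    = $opt->{na_value} // 'NA';   # documented default
    my $width = max(map { scalar @$_ } @$matrix);
    my @lines;
    for my $row (@$matrix) {
        # Undefined cells become the NA value.
        my @cells = map { defined $_ ? $_ : $na } @$row;
        # Short rows are padded with empty strings up to the longest row.
        push @cells, ('') x ($width - @cells) if $opt->{fill_missing_cols};
        push @lines, join($sep, @cells);
    }
    return @lines;
}

my @lines = format_matrix(
    [ [ 1, 2, 3 ], [ 4, undef ], [ 7, 8, 9 ] ],
    { sep => ",", fill_missing_cols => 1 },
);
print "$_\n" for @lines;
```

In this sketch the second row comes out as "4,NA,": the undefined cell is replaced and the missing third column is an empty string.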

$matrix_ref = mslurp($file, \%config)
($matrix_ref, $header_ref, $row_names_ref) = mslurp($file, \%config)

Reads in the contents of $file and puts them in an array of arrayrefs.

You can set the delimiter via the configuration by supplying { sep => qr/\t/ } as config hash.

Further options with defaults:

    %config = (
        sep           => qr/\t/,      # set column separator
        header        => 0,           # parse header
        skip          => 0,           # skip the first N lines (without header)
        row_names     => 0,           # parse row names
        comment       => qr/^\s*#/,   # the comment character
        record_filter => undef,       # set a function to filter records
    );
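
As an illustration of how sep, comment, header and row_names interact, the following sketch parses a small in-memory table in core Perl. It is not the module's code: parse_matrix is a hypothetical helper, skip and record_filter are left out because their exact semantics are not spelled out here, and keeping the first header cell when row_names is set is an assumption.

```perl
use strict;
use warnings;

# Hypothetical parser, NOT the module's code.
sub parse_matrix {
    my ($text, $c) = @_;
    $c ||= {};
    my $sep     = $c->{sep}     // qr/\t/;
    my $comment = $c->{comment} // qr/^\s*#/;
    my (@matrix, $header, @row_names);
    for my $line (split /\n/, $text) {
        next if $line =~ $comment;                  # drop comment lines
        my @fields = split $sep, $line, -1;         # -1 keeps trailing empty fields
        if ($c->{header} && !$header) {             # first data line is the header
            $header = \@fields;
            next;
        }
        push @row_names, shift @fields if $c->{row_names};   # first column is the row name
        push @matrix, \@fields;
    }
    return (\@matrix, $header, \@row_names);
}

my $text = "# expression table\nid\ts1\ts2\ng1\t5\t7\ng2\t0\t3\n";
my ($m, $h, $rn) = parse_matrix($text, { header => 1, row_names => 1 });

print join(",", @$h), "\n";                          # header cells
print "$rn->[$_]: @{ $m->[$_] }\n" for 0 .. $#$m;    # row name and row values
```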
    
lspew($fh_or_filename, $list, $config_options)

Writes a list of values to a file. It can handle filenames and filehandles, but if you supply a handle, you have to close it yourself. $list can be one of the following:

hash ref of array refs

results in

    keya avalue0 avalue1
    keyb bvalue0 bvalue1
    ...

hash ref

results in

    keya valuea
    keyb valueb
    ...

array ref

results in

    value0
    value1
    ...

$config_options is a hash ref. It can take the options:

    $config_options = {
        delim => "\t",
    };
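
The three input shapes can be sketched in core Perl as follows. This only illustrates the line layout described above and is not how lspew is implemented; format_list is a hypothetical helper, and sorting the hash keys is an assumption made to keep the output deterministic.

```perl
use strict;
use warnings;

# Hypothetical helper, NOT the module's code: one output line per element.
sub format_list {
    my ($list, $opt) = @_;
    $opt ||= {};
    my $delim = $opt->{delim} // "\t";

    # array ref: one value per line
    return @$list if ref($list) eq 'ARRAY';

    # hash ref: key followed by its value, or by all values of an array ref
    return map {
        my $v = $list->{$_};
        join($delim, $_, ref($v) eq 'ARRAY' ? @$v : $v);
    } sort keys %$list;    # assumption: keys are sorted
}

my @a   = format_list([qw(value0 value1)]);
my @h   = format_list({ keya => 'valuea', keyb => 'valueb' });
my @hoa = format_list({ keya => [qw(a0 a1)], keyb => [qw(b0 b1)] }, { delim => "," });

print "$_\n" for @a, @h, @hoa;
```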

SEE ALSO

AUTHOR

jw bargsten, <joachim.bargsten at wur.nl>