NAME

Data::Frame - data frame implementation

VERSION

version 0.003

SYNOPSIS

    use feature 'say';
    use Data::Frame;
    use PDL;

    my $df = Data::Frame->new( columns => [
        z => pdl(1, 2, 3, 4),
        y => ( sequence(4) >= 2 ) ,
        x => [ qw/foo bar baz quux/ ],
    ] );

    say $df;
    # ---------------
    #     z  y  x
    # ---------------
    #  0  1  0  foo
    #  1  2  0  bar
    #  2  3  1  baz
    #  3  4  1  quux
    # ---------------

    say $df->nth_column(0);
    # [1 2 3 4]

    say $df->select_rows( 3,1 );
    # ---------------
    #     z  y  x
    # ---------------
    #  3  4  1  quux
    #  1  2  0  bar
    # ---------------

DESCRIPTION

This implements a data frame container that uses PDL for individual columns. As such, it supports marking missing values (BAD values).

The API is currently experimental and is designed to work with Statistics::NiceR, so be aware that it may change.

METHODS

new

    new( Hash %options ) # returns Data::Frame

Creates a new Data::Frame when passed the following options as a specification of the columns to add:

  • columns => ArrayRef $columns_array

    When columns is passed an ArrayRef of pairs of the form

        $columns_array = [
            column_name_z => $column_01_data, # first column data
            column_name_y => $column_02_data, # second column data
            column_name_x => $column_03_data, # third column data
        ]

    then the column data is added to the data frame in the order that the pairs appear in the ArrayRef.

  • columns => HashRef $columns_hash

    When columns is passed a HashRef of the form

        $columns_hash = {
            column_name_z => $column_03_data, # third column data
            column_name_y => $column_02_data, # second column data
            column_name_x => $column_01_data, # first column data
        }

    then the column data is added to the data frame by the order of the keys in the HashRef (sorted with a stringwise cmp).
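The two forms of the columns option can be compared with a short sketch. It uses only the constructor and the column_names method documented here; the column names and data are made up for illustration, and the orders noted in the comments are what the description above implies rather than captured output.

```perl
use strict;
use warnings;
use feature 'say';
use Data::Frame;
use PDL;

# ArrayRef of pairs: columns keep the order in which the pairs are listed.
my $ordered = Data::Frame->new( columns => [
    z => pdl(1, 2, 3),
    a => pdl(4, 5, 6),
] );
say join ', ', @{ $ordered->column_names };   # listed order: z, a

# HashRef: columns are ordered by a stringwise sort of the keys.
my $sorted = Data::Frame->new( columns => {
    z => pdl(1, 2, 3),
    a => pdl(4, 5, 6),
} );
say join ', ', @{ $sorted->column_names };    # sorted order: a, z
```

The ArrayRef form is the one to use when the column order matters.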

string

    string() # returns Str

Returns a string representation of the Data::Frame.

number_of_columns

    number_of_columns() # returns Int

Returns the number of columns in the Data::Frame.

number_of_rows

    number_of_rows() # returns Int

Returns the number of rows in the Data::Frame.

nth_column

    nth_column(Int $n) # returns a column

Returns the column at zero-based index $n. Negative indices count from the end (e.g., $n = -1 returns the last column).
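As a small illustration of the indexing, the sketch below fetches the first and last columns of a two-column frame. The data is invented for the example, and the comments describe which column each call is expected to return based on the description above.

```perl
use strict;
use warnings;
use feature 'say';
use Data::Frame;
use PDL;

my $df = Data::Frame->new( columns => [
    z => pdl(1, 2, 3, 4),
    y => pdl(5, 6, 7, 8),
] );

my $first = $df->nth_column(0);    # index 0 is the first column (z)
my $last  = $df->nth_column(-1);   # index -1 is the last column (y)

say $first;
say $last;
```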

column_names

    column_names() # returns an ArrayRef

    column_names( @new_column_names ) # returns an ArrayRef

Returns an ArrayRef of the names of the columns.

If passed a list of arguments @new_column_names, then the columns will be renamed to the elements of @new_column_names. The length of the argument must match the number of columns in the Data::Frame.

row_names

    row_names() # returns a PDL

    row_names( Array @new_row_names ) # returns a PDL

    row_names( ArrayRef $new_row_names ) # returns a PDL

    row_names( PDL $new_row_names ) # returns a PDL

Returns a PDL of the names of the rows.

If passed an argument, then the rows will be renamed. The length of the argument must match the number of rows in the Data::Frame.

column

    column( Str $column_name )

Returns the column with the name $column_name.

add_columns

    add_columns( Array @column_pairlist )

Adds all the columns in @column_pairlist to the Data::Frame.

add_column

    add_column(Str $name, $data)

Adds a single column to the Data::Frame with the name $name and data $data.
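A minimal sketch of growing a frame with add_column and add_columns follows. The column names and values are hypothetical, and each new column is given the same number of elements as the existing ones on the assumption that columns of differing length are not accepted.

```perl
use strict;
use warnings;
use feature 'say';
use Data::Frame;
use PDL;

my $df = Data::Frame->new( columns => [ z => pdl(1, 2, 3, 4) ] );

# Add one column by name.
$df->add_column( w => pdl(10, 20, 30, 40) );

# Add several columns at once from a list of name => data pairs.
$df->add_columns(
    v => pdl(0, 1, 0, 1),
    u => pdl(7, 8, 9, 10),
);

say $df->number_of_columns;
say join ', ', @{ $df->column_names };
```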

select_rows

    select_rows( Array @which )

    select_rows( ArrayRef $which )

    select_rows( PDL $which )

The argument $which is a vector of indices. select_rows returns a new Data::Frame that contains rows that match the indices in the vector $which.

The returned Data::Frame supports PDL's data flow, meaning that changes to the values in the child data frame's columns will appear in the parent data frame.

If no indices are given, a Data::Frame with no rows is returned.
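The data flow behaviour can be sketched as below. The example assumes that assigning through PDL's overloaded `.=` operator on a column of the child frame is a suitable way to change its values; the data is invented, and no output is shown because the result has not been captured from a real run.

```perl
use strict;
use warnings;
use feature 'say';
use Data::Frame;
use PDL;

my $df = Data::Frame->new( columns => [
    z => pdl(1, 2, 3, 4),
    y => pdl(5, 6, 7, 8),
] );

# Child frame holding rows 3 and 1 of the parent, in that order.
my $child = $df->select_rows( 3, 1 );
say $child->number_of_rows;

# Assign into the child's z column in place (assumption: `.=` flows
# back to the parent, as the data flow description suggests).
my $child_z = $child->column('z');
$child_z .= 99;

# Rows 3 and 1 of the parent's z column should now reflect the change.
say $df->column('z');
```

Copying the selected columns before modifying them would be the way to avoid touching the parent.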

SEE ALSO

AUTHOR

Zakariyya Mughal <zmughal@cpan.org>

COPYRIGHT AND LICENSE

This software is copyright (c) 2014 by Zakariyya Mughal.

This is free software; you can redistribute it and/or modify it under the same terms as the Perl 5 programming language system itself.