NAME

Microarray::GEO::SOFT::GDS - GEO data set data class

SYNOPSIS

  use Microarray::GEO::SOFT;
  my $soft = Microarray::GEO::SOFT->new;
  $soft->download("GDS3719");
  
  my $gds = $soft->parse;
  
  # the meta information
  $gds->meta;
  $gds->platform;
  $gds->title;
  $gds->accession;
  
  # the sample data is a matrix
  $gds->matrix;
  # the names for each column
  $gds->colnames;
  # the names for each row; these are the primary IDs for the rows
  $gds->rownames;

DESCRIPTION

A DataSet represents a curated collection of biologically and statistically comparable GEO Samples and forms the basis of GEO's suite of data display and analysis tools. Samples within a DataSet refer to the same Platform, that is, they share a common set of array elements. Value measurements for each Sample within a DataSet are assumed to be calculated in an equivalent manner, that is, considerations such as background processing and normalization are consistent across the DataSet. Information reflecting experimental factors is provided through DataSet subsets. (Copied from the GEO web site.)

This module retrieves data stored in the GEO data set format. We treat this as the basic microarray data format (an expression matrix).

Subroutines

new("file" => $file, "use_identifier" => 0, "verbose" => 1)

Initialize a GDS class object. The 'file' argument is the path to the microarray data in SOFT format, or a file handle that has already been opened. This argument is optional, since the data can also be downloaded through Microarray::GEO::SOFT. Because gene identifiers have been integrated into the SOFT file, the user can choose whether to take the probe ID or the identifier as the primary ID. We do not recommend setting 'use_identifier' to TRUE, because 'id_convert' will not work in that case. 'verbose' determines whether messages are printed during analysis. 'sample_value_column' is the column name for table data when parsing GSM data.

$gds->parse

Retrieve data set information from the microarray data. A data set in SOFT format is always a table.

$gds->meta

Get meta information

$gds->set_meta(HASH)

Set meta information. Valid arguments are 'accession', 'title' and 'platform'.

$gds->table

Get table information

$gds->set_table

Set table information. Valid arguments are 'rownames', 'colnames', 'colname_explain' and 'matrix'.

$gds->platform

Accession number of the platform the data set belongs to.

$gds->title

Title of the data set record

$gds->accession

Accession number for the data set

$gds->rownames

Primary ID for probes

$gds->colnames

Different sample names or experiment designs

$gds->colnames_explain

A more detailed explanation of the column names

$gds->matrix

Expression value matrix
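
As a rough illustration of how row names, column names and the matrix relate to each other, the sketch below looks up a single value by row name and column name. It does not call the module. The data are made up, the helper value_at is hypothetical, and it assumes the matrix is stored as an array of rows, which this documentation does not state.

```perl
use strict;
use warnings;

# Made-up stand-ins for what rownames, colnames and matrix might hold.
# Assumption: the matrix is an array of rows, one row per probe.
my @rownames = ('probe_1', 'probe_2');
my @colnames = ('sample_a', 'sample_b', 'sample_c');
my @matrix   = (
    [ 7.1, 6.9, 8.3 ],
    [ 2.4, 2.2, 2.9 ],
);

# Hypothetical helper (not part of the module): return the value stored
# for a given row name and column name, or nothing if either is unknown.
sub value_at {
    my ($row_name, $col_name) = @_;
    my ($i) = grep { $rownames[$_] eq $row_name } 0 .. $#rownames;
    my ($j) = grep { $colnames[$_] eq $col_name } 0 .. $#colnames;
    return unless defined $i && defined $j;
    return $matrix[$i][$j];
}

print value_at('probe_2', 'sample_c'), "\n";
```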

$gds->id_convert($gpl, $to_id)

Transform the primary ID to a new ID type. The first argument is the Microarray::GEO::SOFT::GPL class object for the platform that the GDS belongs to. The second argument is the ID type to map to. It must be one of the colnames of $gpl; a regexp is also accepted. It returns a Microarray::ExprSet object.
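
The idea behind this kind of conversion can be sketched in plain Perl without the module. The example below is only an illustration under stated assumptions: the platform table is made up, the helper convert_ids is hypothetical, the probe ID is assumed to sit in the first column, and the first column whose name matches is used. How the module itself treats unmapped probes or several probes that map to the same ID is not described here, so the sketch simply leaves them as they are.

```perl
use strict;
use warnings;

# Made-up platform table: column names plus one row per probe.
my @gpl_colnames = ('ID', 'Gene Symbol', 'Entrez ID');
my @gpl_rows = (
    [ 'probe_1', 'GENE_A', '101' ],
    [ 'probe_2', 'GENE_B', '102' ],
    [ 'probe_3', 'GENE_C', '103' ],
);

# Hypothetical helper (not part of the module): map each row name to the
# value in the platform column selected by $to_id, which may be either an
# exact column name or a compiled regexp.
sub convert_ids {
    my ($rownames, $colnames, $rows, $to_id) = @_;
    my $pattern = ref $to_id eq 'Regexp' ? $to_id : qr/^\Q$to_id\E$/;

    # Use the first column whose name matches.
    my ($col) = grep { $colnames->[$_] =~ $pattern } 0 .. $#$colnames;
    die "no platform column matches $to_id\n" unless defined $col;

    # Assumption: the probe ID is in the first column of each row.
    my %map = map { $_->[0] => $_->[$col] } @$rows;

    # Probes missing from the platform table come back as undef.
    return [ map { $map{$_} } @$rownames ];
}

my $new_ids = convert_ids(
    [ 'probe_2', 'probe_1' ], \@gpl_colnames, \@gpl_rows, qr/symbol/i,
);
print join(", ", @$new_ids), "\n";
```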

$gds->soft2exprset

Transform Microarray::GEO::SOFT::GDS class object to Microarray::ExprSet class object.

$gds->get_subset(HASH)

Get a subset of rows and columns from the expression matrix. Valid arguments are 'byrow' and 'bycol'. The value of each argument should be an array reference whose length equals the number of rownames or colnames of the matrix, respectively. Each value in the array should be either TRUE (1) or FALSE (0) to indicate whether to keep or drop the corresponding position in the matrix. It returns a Microarray::ExprSet object.
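
The mask-based selection described above can be sketched in plain Perl. This is a minimal illustration, not the module's implementation: the helper subset_by_mask is hypothetical, the matrix is made up, and it is assumed to be a non-empty array of rows. The real method returns a Microarray::ExprSet object rather than a bare array reference.

```perl
use strict;
use warnings;

# Hypothetical helper (not part of the module): keep only the rows and
# columns whose mask entry is true. Assumes a non-empty array of rows.
sub subset_by_mask {
    my ($matrix, $byrow, $bycol) = @_;

    # Each mask must be as long as the dimension it selects from.
    die "byrow length must equal the number of rows\n"
        unless @$byrow == @$matrix;
    die "bycol length must equal the number of columns\n"
        unless @$bycol == @{ $matrix->[0] };

    # Turn the 1/0 masks into lists of indices to keep.
    my @keep_rows = grep { $byrow->[$_] } 0 .. $#$byrow;
    my @keep_cols = grep { $bycol->[$_] } 0 .. $#$bycol;

    # Build a new matrix from the kept rows, slicing out the kept columns.
    return [ map { [ @{ $matrix->[$_] }[@keep_cols] ] } @keep_rows ];
}

my $matrix = [
    [ 1, 2, 3 ],
    [ 4, 5, 6 ],
    [ 7, 8, 9 ],
];

# Keep rows 1 and 3, and columns 2 and 3.
my $subset = subset_by_mask($matrix, [ 1, 0, 1 ], [ 0, 1, 1 ]);
print join("\t", @$_), "\n" for @$subset;
```

Using masks rather than index lists means a mask can be computed directly from the row or column names, for example by testing each name against a pattern.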

AUTHOR

Zuguang Gu <jokergoo@gmail.com>

COPYRIGHT AND LICENSE

Copyright 2012 by Zuguang Gu

This library is free software; you can redistribute it and/or modify it under the same terms as Perl itself, either Perl version 5.12.1 or, at your option, any later version of Perl 5 you may have available.

SEE ALSO

Microarray::GEO::SOFT