NAME

Bio::DB::Das::Chado::Segment - DAS-style access to a chado database

SYNOPSIS

  # Get a Bio::Das::SegmentI object from a Bio::DB::Das::Chado database...

  $segment = $das->segment(-name => 'Landmark',
                           -start=> $start,
                           -stop => $stop);

  @features = $segment->overlapping_features(-type=>['type1','type2']);
  # each feature is a Bio::SeqFeatureI-compliant object

  @features = $segment->contained_features(-type=>['type1','type2']);

  @features = $segment->contained_in(-type=>['type1','type2']);

  $stream = $segment->get_feature_stream(-type=>['type1','type2','type3']);
  while (my $feature = $stream->next_seq) {
     # do something with feature
  }

  $count = $segment->features_callback(-type=>['type1','type2','type3'],
                                       -callback => sub { ... }
                                       );

DESCRIPTION

Bio::DB::Das::Chado::Segment is a simplified alternative interface to sequence annotation databases used by the distributed annotation system. In this scheme, the genome is represented as a series of landmarks. Each Bio::DB::Das::Chado::Segment object ("segment") corresponds to a genomic region defined by a landmark and a start and end position relative to that landmark. A segment is created using the Bio::DasI segment() method.

Features can be filtered by the following attributes:

  1) their location relative to the segment (whether overlapping,
          contained within, or completely containing)

  2) their type

  3) other attributes using tag/value semantics

Access to the feature list uses three distinct APIs:

  1) fetching the entire list of features at once

  2) fetching an iterator across features

  3) a callback

FEEDBACK

Mailing Lists

User feedback is an integral part of the evolution of this and other Bioperl modules. Send your comments and suggestions preferably to one of the Bioperl mailing lists. Your participation is much appreciated.

  bioperl-l@bio.perl.org

Reporting Bugs

Report bugs to the Bioperl bug tracking system to help us keep track of the bugs and their resolution. Bug reports can be submitted via email or the web:

  bioperl-bugs@bio.perl.org
  http://bio.perl.org/bioperl-bugs/

AUTHOR - Scott Cain

Email cain@cshl.org

APPENDIX

The rest of the documentation details each of the object methods. Internal methods are usually preceded with a _

name

 Title   : name
 Usage   : $segname = $seg->name();
 Function: Returns the name of the segment
 Returns : see above
 Args    : none
 Status  : public

feature_id()

  Title   : feature_id
  Usage   : $obj->feature_id($newval)
  Function: holds feature.feature_id
  Returns : value of feature_id (a scalar)
  Args    : on set, new value (a scalar or undef, optional)

strand()

  Title   : strand
  Usage   : $obj->strand()
  Function: Returns the strand of the feature.  Unlike the other
            methods, the strand cannot be changed once the object is
            created (due to coordinate considerations).
            Corresponds to featureloc.strand.
  Returns : -1, 0, or 1
  Args    : on set, new value (a scalar or undef, optional)

attributes

 Title   : attributes 
 Usage   : @attributes = $obj->attributes;
 Function: get the "attributes" of this segment
 Returns : An array of strings
 Args    : None

This is an object-specific wrapper around the more generic attributes method in Bio::DB::Das::Chado.

_search_by_name

 Title   : _search_by_name 
 Usage   : _search_by_name($name);
 Function: Searches for segments based on a name
 Returns : Either a scalar (a feature_id) or an array ref (containing feature_ids)
 Args    : A string (name)
 Status  : private (used by new)

class

  Title   : class
  Usage   : $obj->class($newval)
  Function: Returns the segment class (synonymous with type)
  Returns : value of class (a scalar)
  Args    : on set, new value (a scalar or undef, optional)

type

  Title   : type
  Usage   : $obj->type($newval)
  Function: used to be alias of class() for backward compatibility,
            now behaves the same as Bio::DB::Das::Chado::Segment::Feature->type
  Returns : A Bio::DB::GFF::Typename object
  Args    : on set, new value: Bio::DB::GFF::Typename object

seq_id

 Title   : seq_id
 Usage   : $ref = $s->seq_id
 Function: return the ID of the landmark, aliased to name() for backward compatibility
 Returns : a string
 Args    : none
 Status  : Public

start

 Title   : start
 Usage   : $s->start
 Function: start of segment
 Returns : integer
 Args    : none
 Status  : Public

low

 Title   : low
 Usage   : $s->low
 Function: start of segment
 Returns : integer
 Args    : none
 Status  : Public

Alias of start for backward compatibility

end

 Title   : end
 Usage   : $s->end
 Function: end of segment
 Returns : integer
 Args    : none
 Status  : Public

high

 Title   : high
 Usage   : $s->high
 Function: end of segment
 Returns : integer
 Args    : none
 Status  : Public

Alias of end for backward compatibility

stop

 Title   : stop
 Usage   : $s->stop
 Function: end of segment
 Returns : integer
 Args    : none
 Status  : Public

Alias of end for backward compatibility

length

 Title   : length
 Usage   : $s->length
 Function: length of segment
 Returns : integer
 Args    : none
 Status  : Public

Returns the length of the segment. Always a positive number.

features

 Title   : features
 Usage   : @features = $s->features(@args)
 Function: get features that overlap this segment
 Returns : a list of Bio::SeqFeatureI objects
 Args    : see below
 Status  : Public

This method will find all features that intersect the segment in a variety of ways and return a list of Bio::SeqFeatureI objects. The feature locations will use coordinates relative to the reference sequence in effect at the time that features() was called.

The returned list can be limited to certain types, attributes or range intersection modes. Types of range intersection are one of:

   "overlaps"      the default
   "contains"      return features completely contained within the segment
   "contained_in"  return features that completely contain the segment

Two types of argument lists are accepted. In the positional argument form, the arguments are treated as a list of feature types. In the named parameter form, the arguments are a series of -name=>value pairs.

  Argument    Description
  --------   ------------

  -types      An array reference to type names in the format
              "method:source"

  -attributes A hashref containing a set of attributes to match

  -rangetype  One of "overlaps", "contains", or "contained_in".

  -iterator   Return an iterator across the features.

  -callback   A callback to invoke on each feature
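
The three -rangetype modes can be read as interval predicates. The following is a minimal sketch in plain Perl, assuming inclusive integer coordinates with start <= end. The helper range_matches is hypothetical and is not part of this module; it only illustrates what each mode selects.

```perl
use strict;
use warnings;

# Hypothetical helper (not part of Bio::DB::Das::Chado::Segment):
# tests how a feature interval relates to a segment interval.
# Coordinates are assumed to be inclusive, with start <= end.
sub range_matches {
    my ($mode, $seg_start, $seg_end, $f_start, $f_end) = @_;
    if ($mode eq 'overlaps') {
        # feature shares at least one position with the segment
        return ($f_start <= $seg_end && $f_end >= $seg_start) ? 1 : 0;
    }
    if ($mode eq 'contains') {
        # feature lies completely inside the segment
        return ($f_start >= $seg_start && $f_end <= $seg_end) ? 1 : 0;
    }
    if ($mode eq 'contained_in') {
        # feature completely spans the segment
        return ($f_start <= $seg_start && $f_end >= $seg_end) ? 1 : 0;
    }
    die "unknown range type: $mode";
}

# Segment 100..200 compared against three example features
for my $f ([150, 180], [50, 120], [50, 300]) {
    my @modes = grep { range_matches($_, 100, 200, @$f) }
                qw(overlaps contains contained_in);
    print "$f->[0]..$f->[1]: @modes\n";
}
```

A feature that satisfies "contains" or "contained_in" also satisfies "overlaps", which is why "overlaps" is the broadest of the three modes.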

The -attributes argument is a hashref containing one or more attributes to match against:

  -attributes => { Gene => 'abc-1',
                   Note => 'confirmed' }

Attribute matching is simple string matching, and multiple attributes are ANDed together. More complex filtering can be performed using the -callback option (see below).
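
The matching rule can be sketched in plain Perl as follows. This is an illustration only: the helper attributes_match is hypothetical, and modelling a feature's attributes as a hash of tag => [values] is an assumption made for the example, not the module's internal representation.

```perl
use strict;
use warnings;

# Hypothetical helper: returns 1 only if every wanted tag is present with
# exactly the wanted value (plain string equality, conditions ANDed).
# Feature attributes are modelled as tag => [values] for illustration.
sub attributes_match {
    my ($have, $want) = @_;
    for my $tag (keys %$want) {
        my @values = @{ $have->{$tag} || [] };
        return 0 unless grep { $_ eq $want->{$tag} } @values;
    }
    return 1;
}

my %filter = (Gene => 'abc-1', Note => 'confirmed');
my %feat_a = (Gene => ['abc-1'], Note => ['confirmed', 'curated']);
my %feat_b = (Gene => ['abc-1'], Note => ['predicted']);

print attributes_match(\%feat_a, \%filter) ? "match\n" : "no match\n";
print attributes_match(\%feat_b, \%filter) ? "match\n" : "no match\n";
```

The first feature satisfies both conditions; the second fails on Note and is therefore rejected even though Gene matches.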

If -iterator is true, then the method returns an object reference that implements the next_seq() method. Each call to next_seq() returns a new Bio::SeqFeatureI object.
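
The consuming loop looks like the sketch below. MockIterator is a stand-in written for this example so that the code runs without a database; in real use the object would come from features(-iterator => 1) and would yield Bio::SeqFeatureI objects rather than strings.

```perl
use strict;
use warnings;

# Stand-in iterator for illustration only; it is not part of this module.
package MockIterator;
sub new      { my ($class, @items) = @_; return bless { queue => [@items] }, $class }
sub next_seq { my $self = shift; return shift @{ $self->{queue} } }

package main;

my $iterator = MockIterator->new('feature1', 'feature2', 'feature3');
my @seen;
while (my $feature = $iterator->next_seq) {
    push @seen, $feature;   # handle one feature at a time
}
print scalar(@seen), " features read\n";
```

The loop ends when next_seq() returns a false value, so only one feature needs to be held at a time.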

If -callback is passed a code reference, the code reference will be invoked on each feature returned. The code will be passed two arguments consisting of the current feature and the segment object itself, and must return a true value. If the code returns a false value, feature retrieval will be aborted.

-callback and -iterator are mutually exclusive options. If -iterator is defined, then -callback is ignored.
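
The callback contract described above can be sketched without a database. The function each_feature below is a hypothetical stand-in for the retrieval loop, and the strings stand in for feature and segment objects; only the calling convention (feature first, segment second, false return aborts) is taken from the description.

```perl
use strict;
use warnings;

# Hypothetical stand-in for the retrieval loop: calls the callback with
# (feature, segment) and stops as soon as the callback returns false.
sub each_feature {
    my ($segment, $features, $callback) = @_;
    my $visited = 0;
    for my $feature (@$features) {
        $visited++;
        last unless $callback->($feature, $segment);
    }
    return $visited;
}

my @collected;
my $visited = each_feature(
    'segment',
    [qw(f1 f2 f3 f4)],
    sub {
        my ($feature, $segment) = @_;
        push @collected, $feature;
        return @collected < 2;   # false once two features are kept: abort
    },
);
print "visited $visited, kept @collected\n";
```

A callback that should see every feature must therefore end with an explicit true value such as "return 1".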

_features2level

  See: features

This is a crude copy-paste of features plus additional code to handle prefetching of two-level features. The generated query performs about as well as the one generated by features, and the calls to Bio::DB::Das::Chado::Segment->sub_SeqFeatures are avoided, but this doesn't lead to a huge performance boost.

If further development improves the performance of this two-level prefetch, features and _features2level will need to be refactored to avoid code duplication.

get_all_SeqFeature, get_SeqFeatures, top_SeqFeatures, all_SeqFeatures

 Title   : get_all_SeqFeature, get_SeqFeatures, top_SeqFeatures, all_SeqFeatures
 Usage   : $s->get_all_SeqFeature()
 Function: get features that overlap this segment (aliases of features())
 Returns : a list of Bio::SeqFeatureI objects
 Args    : none
 Status  : Public

Several aliases of features() for backward compatibility

dna

 Title   : dna
 Usage   : $s->dna
 Function: get the DNA string for this segment
 Returns : a string
 Args    : none
 Status  : Public

Returns the sequence for this segment as a string.

seq

 Title   : seq
 Usage   : $s->seq
 Function: get a Bio::Seq object for this segment
 Returns : a Bio::Seq object
 Args    : none
 Status  : Public

Returns the sequence for this segment as a Bio::Seq object.

factory

 Title   : factory
 Usage   : $factory = $s->factory
 Function: return the segment factory
 Returns : a Bio::DasI object
 Args    : see below
 Status  : Public

This method returns a Bio::DasI object that can be used to fetch more segments. This is typically the Bio::DasI object from which the segment was originally generated.

srcfeature_id

  Title   : srcfeature_id
  Usage   : $obj->srcfeature_id($newval)
  Function: undocumented method by Scott Cain
  Returns : value of srcfeature_id (a scalar)
  Args    : on set, new value (a scalar or undef, optional)

source

  Title   : source
  Usage   : $obj->source($newval)
  Function: Returns the source; sets with an argument
  Returns : A string that is the source
  Args    : A string to set the source

source_tag

  Title   : source_tag
  Function: aliased to source() for Bio::SeqFeatureI compatibility

alphabet

  Title   : alphabet
  Usage   : $obj->alphabet($newval)
  Function: Returns the sequence "type", i.e., dna
  Returns : scalar 'dna'
  Args    : None

display_id, display_name, accession_number, desc

  Title   : display_id, display_name, accession_number, desc
  Usage   : $s->display_name()
  Function: Alias of name()
  Returns : string
  Args    : none

Several aliases for name; it may be that these could do something better than just giving back the name.

get_feature_stream

  Title   : get_feature_stream
  Usage   : $db->get_feature_stream(@args)
  Function: creates a feature iterator
  Returns : A Bio::DB::Das::ChadoIterator object
  Args    : The same arguments as the features method

get_feature_stream has an alias called get_seq_stream for backward compatibility.

clone

 Title   : clone
 Usage   : $copy = $s->clone
 Function: make a copy of this segment
 Returns : a Bio::DB::GFF::Segment object
 Args    : none
 Status  : Public

sourceseq

  Title   : sourceseq
  Usage   : $obj->sourceseq($newval)
  Function: undocumented method by Scott Cain
  Returns : value of sourceseq (a scalar)
  Args    : on set, new value (a scalar or undef, optional)

refseq

 Title   : refseq
 Usage   : $s->refseq
 Function: get or set the reference sequence
 Returns : a string
 Args    : none
 Status  : Public

Examine or change the reference sequence. This is an alias to sourceseq(), provided here for API compatibility with Bio::DB::GFF::RelSegment.

abs_ref

  Title   : abs_ref
  Usage   : $obj->abs_ref()
  Function: Alias of sourceseq
  Returns : value of sourceseq (a scalar)
  Args    : none

Alias of sourceseq for backward compatibility

abs_start

  Title   : abs_start
  Usage   : $obj->abs_start()
  Function: Alias of start
  Returns : value of start (a scalar)
  Args    : none

abs_end

  Title   : abs_end
  Usage   : $obj->abs_end()
  Function: Alias of end
  Returns : value of end (a scalar)
  Args    : none

asString

 Title   : asString
 Usage   : $s->asString
 Function: human-readable string for segment
 Returns : a string
 Args    : none
 Status  : Public

Returns a human-readable string representing this sequence. Format is:

   sourceseq:start,stop
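
As an illustration of that layout, the sketch below builds such a string from three values. The function format_segment and the sample values are hypothetical; they only reproduce the documented format and are not the module's own code.

```perl
use strict;
use warnings;

# Hypothetical formatter reproducing the documented "sourceseq:start,stop" layout.
sub format_segment {
    my ($sourceseq, $start, $stop) = @_;
    return sprintf '%s:%d,%d', $sourceseq, $start, $stop;
}

print format_segment('chr1', 1000, 2000), "\n";
```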