NAME

Bio::DB::Das::BioSQL::Segment - DAS-style access to a BioSQL database

SYNOPSIS

  # Get a Bio::Das::SegmentI object from a Bio::DB::Das::BioSQL database...

  # Should be created through Bio::DB::Das::BioSQL.

  @features = $segment->overlapping_features(-type=>['type1','type2']);
  # each feature is a Bio::SeqFeatureI-compliant object

  @features = $segment->contained_features(-type=>['type1','type2']);

  @features = $segment->contained_in(-type=>['type1','type2']);

  $stream = $segment->get_feature_stream(-type=>['type1','type2','type3']);
  while (my $feature = $stream->next_seq) {
     # do something with feature
  }

DESCRIPTION

Bio::DB::Das::BioSQL::Segment is a simplified alternative interface to sequence annotation databases used by the Distributed Annotation System (DAS). In this scheme, the genome is represented as a series of landmarks. Each Bio::DB::Das::BioSQL::Segment object ("segment") corresponds to a genomic region defined by a landmark and a start and end position relative to that landmark. A segment is created using the Bio::DB::Das::BioSQL segment() method.

The segment will load its features only when the features() method is called. If start and end are not specified and features are requested, all the features for the current segment will be retrieved, which may be slow.

A segment can be created as relative or absolute. If it is absolute, all locations are given beginning from the segment's start, that is, they are between [1 .. (end-start)]. Otherwise, they are given relative to the true start of the segment, regardless of the start value.

FEEDBACK

Mailing Lists

User feedback is an integral part of the evolution of this and other Bioperl modules. Send your comments and suggestions preferably to one of the Bioperl mailing lists. Your participation is much appreciated.

  bioperl-l@bio.perl.org

Reporting Bugs

Report bugs to the Bioperl bug tracking system to help us keep track of the bugs and their resolution. Bug reports can be submitted via email or the web:

  bioperl-bugs@bio.perl.org
  http://bio.perl.org/bioperl-bugs/

AUTHORS - Lincoln Stein, Vsevolod (Simon) Ilyushchenko

Email lstein@cshl.edu, simonf@cshl.edu

APPENDIX

The rest of the documentation details each of the object methods. Internal methods are usually preceded by an underscore (_).

seq_id

 Title   : seq_id
 Usage   : $ref = $s->seq_id
 Function: return the ID of the landmark
 Returns : a string
 Args    : none
 Status  : Public

start

 Title   : start
 Usage   : $s->start
 Function: start of segment
 Returns : integer
 Args    : none
 Status  : Public

This is a read-only accessor for the start of the segment.

end

 Title   : end
 Usage   : $s->end
 Function: end of segment
 Returns : integer
 Args    : none
 Status  : Public

This is a read-only accessor for the end of the segment.

abs_start

 Title   : abs_start
 Usage   : $s->abs_start
 Function: absolute start of segment
 Returns : integer
 Args    : none
 Status  : Public

Returns the absolute start of the segment.

abs_end

 Title   : abs_end
 Usage   : $s->abs_end
 Function: absolute end of segment
 Returns : integer
 Args    : none
 Status  : Public

Returns the absolute end of the segment.

length

 Title   : length
 Usage   : $s->length
 Function: length of segment
 Returns : integer
 Args    : none
 Status  : Public

Returns the length of the segment. Always a positive number.

absolute

 Title   : absolute
 Usage   : $s->absolute
 Function: whether the positions are counted from the true start of the segment
            or from the start value
 Returns : boolean
 Args    : none
 Status  : Public

This is a read-only accessor.

features

 Title   : features
 Usage   : @features = $s->features(@args)
 Function: get features that overlap this segment
 Returns : a list of Bio::SeqFeatureI objects
 Args    : see below
 Status  : Public

This method will find all features that intersect the segment in a variety of ways and return a list of Bio::SeqFeatureI objects. The feature locations will use coordinates relative to the reference sequence in effect at the time that features() was called.

The returned list can be limited to certain types, attributes or range intersection modes. The range intersection mode is one of:

   "overlaps"      return features that overlap the segment (the default)
   "contains"      return features completely contained within the segment
   "contained_in"  return features that completely contain the segment
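
The three modes can be read as simple interval tests. The following is a minimal sketch, assuming 1-based inclusive coordinates; range_matches() is a hypothetical helper written for illustration and is not part of this module.

```perl
use strict;
use warnings;

# Hypothetical helper, not part of Bio::DB::Das::BioSQL::Segment.
# Coordinates are assumed to be 1-based and inclusive.
sub range_matches {
    my ($rangetype, $seg_start, $seg_end, $f_start, $f_end) = @_;

    if ($rangetype eq 'overlaps') {
        # the feature and the segment share at least one position
        return $f_start <= $seg_end && $f_end >= $seg_start;
    }
    if ($rangetype eq 'contains') {
        # the feature lies completely within the segment
        return $f_start >= $seg_start && $f_end <= $seg_end;
    }
    if ($rangetype eq 'contained_in') {
        # the feature completely covers the segment
        return $f_start <= $seg_start && $f_end >= $seg_end;
    }
    die "unknown range type: $rangetype";
}

# Segment 100..200 tested against a feature spanning 150..250
for my $mode (qw(overlaps contains contained_in)) {
    my $hit = range_matches($mode, 100, 200, 150, 250) ? 'yes' : 'no';
    print "$mode: $hit\n";
}
```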

Two types of argument lists are accepted. In the positional argument form, the arguments are treated as a list of feature types. In the named parameter form, the arguments are a series of -name=>value pairs.

  Argument    Description
  --------    -----------

  -types      An array reference to type names in the format
              "method:source"

  -attributes A hashref containing a set of attributes to match

  -rangetype  One of "overlaps", "contains", or "contained_in".

  -iterator   Return an iterator across the features.

  -callback   A callback to invoke on each feature

The -attributes argument is a hashref containing one or more attributes to match against:

  -attributes => { Gene => 'abc-1',
                   Note => 'confirmed' }

Attribute matching is simple string matching, and multiple attributes are ANDed together. More complex filtering can be performed using the -callback option (see below).
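
The matching rule described above can be sketched in plain Perl. This sketch assumes that "simple string matching" means exact string equality; attributes_match() and the sample attribute hash are hypothetical and are not part of this module.

```perl
use strict;
use warnings;

# Hypothetical helper, not part of the module. $have maps attribute names
# to a feature's values; $want is the hashref passed as -attributes.
sub attributes_match {
    my ($have, $want) = @_;
    for my $name (keys %$want) {
        # every requested attribute must be present and string-equal (AND)
        return 0
            unless defined $have->{$name} && $have->{$name} eq $want->{$name};
    }
    return 1;
}

# Sample data invented for illustration
my %feature = (Gene => 'abc-1', Note => 'confirmed', Source => 'curated');
print "match\n"
    if attributes_match(\%feature, { Gene => 'abc-1', Note => 'confirmed' });
```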

If -iterator is true, then the method returns an object reference that implements the next_seq() method. Each call to next_seq() returns a new Bio::SeqFeatureI object.
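
The iterator protocol amounts to calling next_seq() until it stops returning a true value. Below is a minimal in-memory sketch of that protocol; My::FeatureStream is a hypothetical class, plain strings stand in for feature objects, and returning undef at the end of the stream is an assumption.

```perl
use strict;
use warnings;

package My::FeatureStream;   # hypothetical class, for illustration only

sub new {
    my ($class, @features) = @_;
    return bless { queue => [@features] }, $class;
}

# Return the next feature, or undef once the stream is exhausted
sub next_seq {
    my $self = shift;
    return shift @{ $self->{queue} };
}

package main;

# Plain strings stand in for Bio::SeqFeatureI objects
my $stream = My::FeatureStream->new(qw(exon1 exon2 exon3));
my @seen;
while (my $feature = $stream->next_seq) {
    push @seen, $feature;
}
print "@seen\n";
```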

If -callback is passed a code reference, the code reference will be invoked on each feature returned. The code will be passed two arguments consisting of the current feature and the segment object itself, and must return a true value. If the code returns a false value, feature retrieval will be aborted.
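
The callback contract can be sketched as a loop that stops at the first false return. The each_feature() function below is a hypothetical stand-in for the retrieval loop, not the module's own code, and the segment and features are placeholder strings.

```perl
use strict;
use warnings;

# Hypothetical stand-in for the retrieval loop; not the module's code.
# Invokes $callback->($feature, $segment) and stops at the first false return.
sub each_feature {
    my ($segment, $features, $callback) = @_;
    my $seen = 0;
    for my $feature (@$features) {
        $seen++;
        last unless $callback->($feature, $segment);
    }
    return $seen;   # number of features the callback was invoked on
}

my @names;
my $visited = each_feature(
    'chrI:1,1000',                       # placeholder for a segment object
    [qw(exon1 exon2 stop exon3)],        # placeholder features
    sub {
        my ($feature, $segment) = @_;
        return 0 if $feature eq 'stop';  # a false value aborts retrieval
        push @names, $feature;
        return 1;                        # a true value continues
    },
);
print "visited $visited, kept @names\n";
```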

-callback and -iterator are mutually exclusive options. If -iterator is defined, then -callback is ignored.

NOTE: In this implementation, -attributes does exactly nothing, and features() is wildly inefficient because it works by calling top_SeqFeatures and then filtering by position in the Perl layer, rather than filtering by position in the SQL layer.

top_SeqFeatures

 Title   : top_SeqFeatures
 Usage   : $s->top_SeqFeatures
 Function: retrieve an array of features from the underlying BioDB object.
 Returns : an array
 Args    : none
 Status  : Private

First, make the adaptor retrieve the feature objects from the database. Then, get the actual objects and adjust the features' locations if necessary.

get_seq_stream

 Title   : get_seq_stream
 Usage   : my $seqio = $self->get_seq_stream(@args)
 Function: Performs a query and returns an iterator over it
 Returns : a Bio::SeqIO stream capable of returning Bio::Das::SegmentI objects
 Args    : As in features()
 Status  : Public

This routine takes the same arguments as features(), but returns a Bio::SeqIO::Stream-compliant object. Use it like this:

  $stream = $segment->get_seq_stream('exon');
  while (my $exon = $stream->next_seq) {
     print $exon,"\n";
  }

NOTE: In the interface this method is aliased to get_feature_stream(), as the name is more descriptive.

seq

 Title   : seq
 Usage   : $s->seq
 Function: get the sequence string for this segment
 Returns : a string
 Args    : none
 Status  : Public

Returns the sequence for this segment as a Bio::PrimarySeq object.

factory

 Title   : factory
 Usage   : $factory = $s->factory
 Function: return the segment factory
 Returns : a Bio::DasI object
 Args    : see below
 Status  : Public

This method returns a Bio::DasI object that can be used to fetch more segments. This is typically the Bio::DasI object from which the segment was originally generated.

bioseq

 Title   : bioseq
 Usage   : $bioseq = $s->bioseq
 Function: return the underlying Bio::Seq object
 Returns : a Bio::Seq object
 Args    : none
 Status  : Public

asString

 Title   : asString
 Usage   : $s->asString
 Function: human-readable representation of the segment
 Returns : a string
 Args    : none
 Status  : Public

This method will return a human-readable representation of the segment. It is the overloaded method call for the "" operator.

Currently the format is:

  refseq:start,stop
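
Overloading the "" operator in this way can be sketched with Perl's overload pragma. My::Segment and its hash keys are hypothetical and only illustrate the mechanism and the format shown above; they are not the module's actual internals.

```perl
use strict;
use warnings;

package My::Segment;   # hypothetical class, for illustration only

# Route string interpolation ("$segment") to the asString() method
use overload '""' => 'asString', fallback => 1;

sub new {
    my ($class, %args) = @_;
    return bless { %args }, $class;
}

sub asString {
    my $self = shift;
    return "$self->{refseq}:$self->{start},$self->{stop}";
}

package main;

my $segment = My::Segment->new(refseq => 'chrI', start => 100, stop => 200);
print "$segment\n";
```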