NAME

Bio::Das - Interface to Distributed Annotation System

SYNOPSIS

  use Bio::Das;

  # contact a DAS server using the "elegans" data source
  my $das      = Bio::Das->new('http://www.wormbase.org/db/das' => 'elegans');

  # fetch a segment
  my $segment  = $das->segment(-ref=>'CHROMOSOME_I',-start=>10_000,-stop=>20_000);

  # get features and DNA from segment
  my @features = $segment->features;
  my $dna      = $segment->dna;

  # find out what data sources are available:
  my $db       = Bio::Das->new('http://www.wormbase.org/db/das');
  my @sources  = $db->sources;

  # select a source
  $db->dsn($sources[1]);

  # find out what feature types are available
  my @types       = $db->types;

  # get the stylesheet
  my $stylesheet  = $db->stylesheet;

  # get the entry points
  my @entry_points = $db->entry_points;

DESCRIPTION

Bio::Das provides access to genome sequencing and annotation databases that export their data in Distributed Annotation System (DAS) format. This system is described at http://biodas.org.

The components of the Bio::Das class hierarchy are:

Bio::Das

This class performs I/O with the DAS server, and is responsible for generating Bio::Das::Segment, Bio::Das::Stylesheet, and Bio::Das::Source objects.

Bio::Das::Segment

This class encapsulates information about a named segment of the genome. Segments are generated by Bio::Das, and in turn are responsible for generating Bio::Das::Segment::Feature objects. Bio::Das::Segment implements the Bio::RangeI interface.

Bio::Das::Segment::Feature

This is a subclass of Bio::Das::Segment, and provides information about an annotated genomic feature. In addition to implementing Bio::RangeI, this class implements the Bio::SeqFeatureI interface.

Bio::Das::Segment::GappedAlignment

This is a subclass of Bio::Das::Segment::Feature that adds a minimal set of methods appropriate for manipulating gapped alignments.

Bio::Das::Segment::Transcript

This is a subclass of Bio::Das::Segment::Feature that adds a minimal set of methods appropriate for manipulating mRNA transcript models.

Bio::Das::Stylesheet

This is a class that translates Bio::Das::Segment::Feature objects into suggested glyph names and arguments. It represents the remote DAS server's suggestions for how particular annotations should be represented visually.

Bio::Das::Source

This class contains descriptive information about a DAS data source (DSN).

Bio::Das::Parser

This is a base class used by the Bio::Das::* hierarchy that provides methods for parsing the XML used in DAS data transmission.

Bio::Das::Util

Internally-used utility functions.

OBJECT CREATION

The public Bio::Das constructor is new():

$das = Bio::Das->new($server_url [,$dsn])

Create a new Bio::Das object, associated with the URL given in $server_url. The server URL uses the format described in the specification at biodas.org, and consists of a site-specific prefix and the "/das" path name. For example:

 http://www.wormbase.org/db/das
 ^^^^^^^^^^^^^^^^^^^^^^^^^^
 site-specific prefix

The optional $dsn argument specifies a data source, for use by DAS servers that provide access to several annotation sets. A data source is a symbolic name, such as 'human_genes'. A list of such sources can be obtained from the server by using the sources() method. Once set, the data source can be examined or changed with the dsn() method.

OBJECT METHODS

Once created, the Bio::Das object provides the following methods:

@sources = $das->sources

Return a list of data sources available from this server. This is one of the few methods that can be called before setting the data source.

$segment = $das->segment($id)
$segment = $das->segment(-ref => $reference [,@args]);

The segment() method returns a new Bio::Das::Segment object, which can be queried for information related to a sequence segment. There are two forms of this call. In the single-argument form, you pass segment() an ID to be used as the reference sequence. Sequence IDs are server-specific (some servers will accept GenBank accession numbers, others more complex IDs such as Locus:unc-9). The method will return a Bio::Das::Segment object containing a region of the genome corresponding to the ID.

Instead of a segment ID, you may use a previously-created Bio::Das::Segment object, in which case a copy of the segment will be returned to you. You can then adjust its start and end positions.

In the multiple-argument form, you pass a series of argument/value pairs:

  Argument   Value                   Default
  --------   -----                   -------

  -ref       Reference ID            none
  -segment   Bio::Das::Segment obj   none
  -start     Starting position       1
  -stop      Ending position         length of ref ID
  -offset    Starting position       0
             (0-based)
  -length    Length of segment       length of ref ID

The -ref argument is required, and indicates the ID of the genomic segment to retrieve. -segment is optional, and lets you use a previously-created Bio::Das::Segment object as the reference point instead. If both arguments are passed, -segment supersedes -ref.

-start and -stop indicate the start and stop of the desired genomic segment, relative to the reference ID. If not provided, they default to the start and stop of the reference segment. These arguments use 1-based indexing, so a -start of 0 positions the segment one base before the start of the reference.

The -offset and -length arguments are an alternative way to indicate a segment using zero-based indexing. It is probably not a good idea to mix the two calling styles, but if you do, be aware that -offset supersedes -start and -length supersedes -stop.
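The relationship between the two calling styles can be sketched in plain Perl. The helper names below (to_offset_length and to_start_stop) are hypothetical and are not part of Bio::Das. The arithmetic assumes that -offset is simply the 0-based counterpart of the 1-based -start and that -stop is inclusive, as the table of defaults suggests.

```perl
use strict;
use warnings;

# Hypothetical helpers, not part of Bio::Das.

# Convert 1-based, inclusive (-start, -stop) into 0-based (-offset, -length).
sub to_offset_length {
    my ($start, $stop) = @_;
    return ($start - 1, $stop - $start + 1);
}

# Convert 0-based (-offset, -length) back into 1-based (-start, -stop).
sub to_start_stop {
    my ($offset, $length) = @_;
    return ($offset + 1, $offset + $length);
}

my ($offset, $length) = to_offset_length(10_000, 20_000);
print "offset=$offset length=$length\n";   # offset=9999 length=10001

my ($start, $stop) = to_start_stop($offset, $length);
print "start=$start stop=$stop\n";         # start=10000 stop=20000
```

Under these assumptions, the region from the SYNOPSIS (-start=>10_000, -stop=>20_000) would be requested in the other style as -offset=>9_999, -length=>10_001.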

Note that no checking of the validity of the passed reference ID will be performed until you call the segment's features() or dna() methods.

@entry_points = $das->entry_points

The entry_points() method returns an array of Bio::Das::Segment objects that have been designated "entry points" by the DAS server. Also see the Bio::Das::Segment->entry_points() method.

$stylesheet = $das->stylesheet

Return the stylesheet from the remote DAS server. The stylesheet contains suggestions for the visual format for the various features provided by the server and can be used to translate features into glyphs. The object returned is a Bio::Das::Stylesheet object.

@types = $das->types

This method returns a list of all the annotation feature types served by the DAS server. The return value is an array of Bio::Das::Type objects.

ACCESSORS

A number of less-frequently used methods are accessors for the Bio::Das object, and can be used to examine and change its settings. Called with no arguments, the accessors return the current value of the setting. Called with a single argument, the accessors change the setting and return its previous value.

  Accessor         Description
  --------         -----------
  server()         Get/set the URL of the server
  error()          Get/set the last error message
  dsn()            Get/set the DSN of the data source
  source()         An alias for dsn()
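The get/set contract described above, in which a setter returns the previous value, can be illustrated with a small stand-in class. My::Settings is hypothetical and only mimics the documented behaviour; it is not how Bio::Das is actually implemented.

```perl
use strict;
use warnings;

# Hypothetical stand-in class that mimics the documented accessor contract.
package My::Settings;

sub new {
    my ($class, %args) = @_;
    return bless {%args}, $class;
}

# With no argument: return the current value.
# With one argument: store it and return the previous value.
sub dsn {
    my $self     = shift;
    my $previous = $self->{dsn};
    $self->{dsn} = shift if @_;
    return $previous;
}

package main;

my $obj      = My::Settings->new(dsn => 'elegans');
my $previous = $obj->dsn('human_genes');        # returns the old value
print "was $previous, now ", $obj->dsn, "\n";   # was elegans, now human_genes

$obj->dsn($previous);                           # restore the earlier setting
```

Returning the old value makes it easy to switch a setting temporarily and then put it back, as the last line does.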

INTERNAL METHODS

The methods in this section are published methods that are used internally. They may be useful for subclassing.

$agent = $das->agent

Return the LWP::UserAgent that will be used for communicating with the DAS server.

$url = $das->base

Return a URL resulting from combining the server URL with the DSN.
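As a rough illustration of what such a combined URL looks like, the sketch below joins a server URL and a DSN with a slash, following the DAS convention of placing the data source name after the "/das" path. The das_base helper is hypothetical and is not the module's own code; the exact string returned by base() may differ.

```perl
use strict;
use warnings;

# Hypothetical helper, not part of Bio::Das: join a server URL and a DSN.
sub das_base {
    my ($server, $dsn) = @_;
    $server =~ s{/+$}{};    # drop any trailing slashes
    return $server unless defined $dsn && length $dsn;
    return "$server/$dsn";
}

print das_base('http://www.wormbase.org/db/das', 'elegans'), "\n";
# http://www.wormbase.org/db/das/elegans
```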

$request = $das->request($query_type [,@args])

Create a LWP::Request object for use in communicating with the DAS server. The $query_type argument is the type of the request, and may be one of "dsn", "entry_points", "dna", "resolve", "types", "features", "link", and "stylesheet". The optional @args array contains a series of name/value pairs to pass to the DAS server.

$url = $das->request_url($query_type)

Creates a URI::URL object corresponding to the indicated query type.

$data = $das->do_request($query_type [,@args] [,-parser=>$parser] [,-chunk=>$chunksize])

This method invokes the DAS query indicated by $query_type using the arguments indicated by @args, and returns the resulting XML document. For example, to get the raw XML output from a DAS server using the dna request on the M7 clone segment from 1 to 30,000, you could call do_request() like this:

 $dna_xml = $das->do_request('dna',-ref=>'M7',-start=>1,-stop=>30000);

Query arguments correspond to the CGI parameters listed for each request in the DAS specification, with the exception that they are preceded by a hyphen.

You may provide a -parser argument, in which case the downloaded XML is passed to the indicated parser for interpretation. The -chunk argument controls the size of the chunks passed to the parser. Parsers must be objects that implement the interface described in Bio::Das::Parser.

AUTHOR

Lincoln Stein <lstein@cshl.org>.

Copyright (c) 2001 Cold Spring Harbor Laboratory

This library is free software; you can redistribute it and/or modify it under the same terms as Perl itself. See DISCLAIMER.txt for disclaimers of warranty.

SEE ALSO

Bio::Das::Segment, Bio::Das::Type, Bio::Das::Stylesheet, Bio::Das::Source, Bio::RangeI