From Code to Community: Sponsoring The Perl and Raku Conference 2025 Learn more

----
presentation_topic: dbic_chado
presentation_title: Bio::Chado::Schema
presentation_subtitle: a standard Perl ORM layer for Chado
presentation_place: San Diego, CA
presentation_date: January 13-14, 2010
----
= What is DBIx::Class?
* Object-relational mapping framework for Perl
* is now the de-facto standard
* nice features
----
= What is DBIx::Class?
* query building (the magic of chainable ResultSets)
* cross-database deployment (using |SQL::Translator| in the backend)
* lots and lots of plugins availabe (dates, testing, ...)
----
= Data Model
* a set of classes
* encapsulate the underlying storage
* providing a nicer, maintainable API for working with the data
This isn't really middleware, it's a code library.
----
= And therefore, Bio::Chado::Schema
== Approx. 180 Perl classes
* every table and view, every module
* Chado docs mirrored in the POD
+ ( So why do we need this? )
----
= Chado needs encapsulation
== Chado's design makes for:
* complex queries
* steep learning curve
* hard to get good performance
----
= Chado needs encapsulation
== complex queries
* generating
* storing
* automating
----
= Chado needs encapsulation
== steep learning curve
* codifying best practices
* unified, high-level documentation
* can help with database administration
----
= Chado needs encapsulation
== performance
* encapsulation => performance optimizations where needed
* first step for more data management
** creative indexing
** materialized views (with their own classes)
----
= BCS Usage
* open the schema. ( actual DB connection is deferred )
+ my $chado = Bio::Chado::Schema->connect( 'dbi:Pg:...', $user, $pass );
+* get a ResultSet object representing a set of Rows
+ my $all_features = $chado->resultset('Sequence::Feature');
+* a subset of those (another ResultSet)
+ my $other_feature = $all_features->search({ name => 'something' });
+* get an actual data object
+ my $some_feature = $all_features->find( 232432 );
$other_feature->first;
# or search in list context returns all resulting rows (careful!)
my @other_features = $all_features->search({ name => 'something' });
+* get its type cvterm
+ say $some_feature->type->name;
----
= BCS Usage: Joined Select
# get features via the potato organism, also joining in the cvterms table
my $potato_bacs =
$chado->resultset('Organism::Organism')
->search({ species => 'Solanum tuberosum' })
->search_related( 'features',
{ 'type.name' => 'BAC_clone'},
{ 'join' => 'type' },
);
----
= BCS Usage: Joined Select
# the equivalent bare SQL
my $potato_bacs = $dbh->selectall_arrayref( <<EOS, undef, 'Solanum tuberosum', 'BAC_clone');
SELECT features.feature_id
, features.dbxref_id
, features.organism_id
, features.name
, features.uniquename
, features.residues
, features.seqlen
, features.md5checksum
, features.type_id
, features.is_analysis
, features.is_obsolete
, features.timeaccessioned
, features.timelastmodified
FROM organism me
LEFT JOIN feature features
ON features.organism_id = me.organism_id
JOIN cvterm type
ON type.cvterm_id = features.type_id
WHERE type.name = 'BAC_clone' AND species = 'Solanum tuberosum'
EOS
----
= BCS Usage: Loading
$chado->resultset( 'Cv::Cv' )
->find_or_create({ name => 'My Fake Ontology' })
->create_related( 'cvterm',
{ name => 'MyFakeTerm' });
makes the SQL:
SELECT me.cv_id
, me.name
, me.definition
FROM cv me
WHERE ( me.name = 'my fake ontology' )
INSERT INTO cv ( name )
VALUES ( ? )
----
= BCS Usage: Transactions
$chado->txn_do(sub {
$chado->resultset('Cv::Cv')
->find_or_create({ name => 'My Fake Ontology' })
->create_related( 'cvterm', { name => 'MyFakeTerm' } );
});
----
= The Real Advantages of DBIC
* easier to manipulate and assemble queries
* Don't Repeat Yourself
----
= The Real Advantages of DBIC
* it's all objects. you can delegate to them, pass them around, etc.
* HOWEVER:
+** usually you don't want to subclass them
** but, see |DBIx::Class::Manual::Cookbook|
----
= The Real Advantages of DBIC
* complex joined queries (Chado queries) are very easy and compact
----
= The Real Advantages of DBIC
* SQL syntax errors are much more difficult to make
----
= Using DBIC with your own tables
* use |DBIx::Class::Schema::Loader| to dump a whole set
* make your own definitions
* your table: other_thing, foreign key feature_id to Chado feature table
package My::DBIC:::Layer::OtherThing;
use base 'DBIx::Class::Core';
__PACKAGE__->table('other_thing');
__PACKAGE__->add_columns(
'other_thing_id' => { ... },
'name' => { ... },
'definition' => { ... },
'feature_id' => { ... },
);
__PACKAGE__->set_primary_key('other_thing_id');
__PACKAGE__->add_unique_constraint('ot_c1', ['name']);
__PACKAGE__->belongs_to(
'feature',
'Bio::Chado::Schema::Sequence::Feature',
{ 'foreign.feature_id' => 'self.feature_id' },
);
----
= "Duct tape" BCS to your own schema
* make an accessor 'other_things' that ties your own DBIC class to BCS
+ Bio::Chado::Schema::Sequence::Feature->has_many(
'other_things',
'My::DBIC::Layer::OtherThing',
{ 'foreign.feature_id' => 'self.feature_id' },
);
+* add it to the BCS schema dynamically
+ Bio::Chado::Schema->register_source('OtherThing', 'My::DBIC::Layer::OtherThing');
+* use it with the rest
+ $chado->resultset('Sequence::Feature')->other_things;
----
= Making a composite schema
my $merged_schema_class =
Bio::Chado::Schema->merge( 'My::DBIC::Layer' );
$merged_schema_class->connect( ... );
$chado->resultset('Sequence::Feature')->other_things;
$chado->resultset('OtherThing')->find(...)->feature;
Note: merge() is new in BCS 0.6, releasing soon.
----
= Further Work
== Still need to add in some DBIx::Class relationships:
* more many_to_many relationships (must be added manually)
== More useful things are needed:
* automate more common querying and loading patterns
* compatibility with BioPerl data objects
----
== Acknowledgments
* Aure Bombarely (SGN)
* Naama Menda (SGN)
* Siddhartha Basu (dictybase)
* Lukas Mueller (SGN)
----
== That's All
* The END