++ed by:
PERLANCAR KEEDI AWNCORP

3 PAUSE users
7 non-PAUSE users.

Laurent Dami

NAME

DBIx::DataModel - Classes and UML-style Associations on top of DBI

SYNOPSIS

in file "MySchema.pm"

Declare the schema

  use DBIx::DataModel;
  DBIx::DataModel->Schema('MySchema'); # MySchema is now a Perl package

Declare the tables with (Perl_name => DB_name => primary key column(s)). Each table then becomes a Perl package.

  MySchema->Table(Employee   => Employee   => qw/emp_id/);
  MySchema->Table(Department => Department => qw/dpt_id/);
  MySchema->Table(Activity   => Activity   => qw/act_id/);

Declare associations in UML style ( [table1 role1 arity1 join1], [table2...]).

  MySchema->Association([qw/Activity   activities * emp_id/],
                        [qw/Employee   employee   1 emp_id/]);
  MySchema->Association([qw/Activity   activities * dpt_id/],
                        [qw/Department department 1 dpt_id/]);

Declare "column types" with some handlers ..

  # date conversion between database (yyyy-mm-dd) and user (dd.mm.yyyy)
  MySchema->ColumnType(Date => 
     fromDB   => sub {$_[0] =~ s/(\d\d\d\d)-(\d\d)-(\d\d)/$3.$2.$1/},
     toDB     => sub {$_[0] =~ s/(\d\d)\.(\d\d)\.(\d\d\d\d)/$3-$2-$1/},
     validate => sub {$_[0] =~ m/(\d\d)\.(\d\d)\.(\d\d\d\d)/});
  
  # 'percent' conversion between database (0.8) and user (80)
  MySchema->ColumnType(Percent => 
     fromDB   => sub {$_[0] *= 100 if $_[0]},
     toDB     => sub {$_[0] /= 100 if $_[0]},
     validate => sub {$_[0] =~ /1?\d?\d/});

.. and apply these "column types" to some of our columns

  Employee->ColumnType(Date    => qw/d_birth/);
  Activity->ColumnType(Date    => qw/d_begin d_end/);
  Activity->ColumnType(Percent => qw/employment_rate/);

Declare a column that will be filled automatically at each update

  MySchema->AutoUpdateColumns(last_modif => 
    sub{$ENV{REMOTE_USER}.", ".scalar(localtime)});

For details that could not be expressed in a declarative way, just add a new method into the table class (but in that case, Schema and Table declarations should be in a BEGIN block, so that the table class is defined before you start adding methods to it).

  package Activity; 
  
  sub activePeriod {
    my $self = shift;
    $self->{d_end} ? "from $self->{d_begin} to $self->{d_end}"
                   : "since $self->{d_begin}";
  }

Declare how to automatically expand objects into data trees

  Activity->AutoExpand(qw/employee department/);

in file "myClient.pl"

  use MySchema;

Search employees whose name starts with 'D' (select API is taken from SQL::Abstract)

  my $empl_D = Employee->select({lastname => {-like => 'D%'}});

idem, but we just want a subset of the columns

  my $empl_F = Employee->select(-columns => [qw/firstname lastname emp_id/],
                                -where   => {lastname => {-like => 'F%'}});

Get a list of employee names in age order

  my $ageLst = Employee->select([qw/lastname firstname/], {}, ['d_birth']);

Print some info from employees, either by direct access to the hashref, or by a method call (handled via AUTOLOAD). Because of the 'fromDB' handler associated with column type 'date', column 'd_birth' has been automatically converted to display format.

  foreach my $emp (@$empl_D) {
    printf "%s (%s) : %s\n", 
      $emp->{firstname}, $emp->lastname, $emp->d_birth;
  }

Follow the joins through role methods

  foreach my $act (@{$emp->activities}) {
    printf "working for %s from $act->{d_begin} to $act->{d_end}", 
      $act->department->name;
  }

Role methods can take arguments too, like select()

  my @recentAct  = $dpt->activities({d_begin => {'>=' => '2005-01-01'}});
  my @recentEmpl = map {$_->employee([qw/firstname lastname/])} @recentAct;

Export the data : external helper modules usually expect a full data tree (instead of calling methods dynamically), so we need to expand the objects :

  $_->expand('activities') foreach @$empl_D;
  my $export = {employees => $empl_D};
  use Data::Dumper; print Dumper   ($export); # export as PerlDump
  use XML::Simple;  print XMLout   ($export); # export as XML
  use JSON;         print objToJson($export); # export as Javascript

Select associated tables directly from a database join, in one single SQL statement (instead of iterating through role methods).

  my $lst = MySchema->ViewFromRoles(qw/Employee activities department/)
                    ->select([qw/lastname dept_name d_begin/], 
                             {d_begin => {'>=' => '2000-01-01'}});

DESCRIPTION

Introduction

This is yet, yet, yet another wrapper framework to build Perl classes and objects around database tables and records. There are many other CPAN modules in this area; perhaps the mere fact that they are so numerous demonstrates that there is more than one way to do it, and none is obviously the best, so why not propose another one ? The "SEE ALSO" section at the end of this documentation gives some pointers.

DBIx::DataModel is written compactly as one single module, and only depends on DBI and SQL::Abstract. It is intended to help client applications in performing common tasks such as data conversion and associations between tables, while retaining an open access both to the base DBI layer and to the basic Perl datastructures, whenever lower-level operations are needed. The focus is on building trees of data which can then be passed to external helper modules for generating XML, Perl dumps, javascript JSON, templates of the Template Toolkit, etc. Such modules need to walk on the data tree, so they cannot work if everything is implemented as OO methods to be called on demand (because there is no simple way to ask for all available methods, and even if you get there, it is not possible to distinguish which of those methods encapsulate relevant data). Therefore DBIx::DataModel does not insist on OO information hiding; on the contrary, direct access to the object hash is encouraged for inspecting the data. In the same spirit, transaction handling is left to the client code.

DBIx::DataModel defines an API for accessing the database from Perl, but will not create the database itself. So use your best database administration tools to define your schemas, tables, keys, relationships and integrity rules; then tell the bare minimum to DBIx::DataModel so that Perl programs can work with the data. To do so, you first declare your tables with their primary keys. Then you declare UML binary associations, which will create the role methods to walk through the data in both directions, possibly with some additional WHERE filters. From your associations, you can also generate some Views to directly query a list of tables, with the appropriate joins between them. At each method call, it is possible to specify which subset of columns should be retrieved (or to rely on the default, namely '*').

Columns may have some associated handlers for performing data transformation of validation. You may also define column types in your schema so that every column of a given type inherits the same collection of handlers (think for example of a 'date' type or a 'phoneNumber' type).

DISCLAIMER: this code is still in beta, the API may slightly change in future versions.

Quickstart

The reference documentation below is quite long, but many details are for advanced usage. For simple tasks you might just want to follow the synopsis :

  1. declare your tables through the Table method

  2. declare your UML binary associations through the Association method

  3. call the select method on one of your main tables

  4. call the role methods to get some data from associated tables

  5. report or update

METHODS

General convention : method names starting with an uppercase letter are meant to be compile-time class methods. These methods will typically be called when loading a module like 'MySchema.pm', and therefore will be executed during the BEGIN phase of the Perl compiler. They instruct the compiler to create classes, methods and datastructures for representing the elements of a database schema.

Method names starting with a lowercase letter are meant to be usual run-time methods, either for classes or for instances.

Compile-time methods

Framework methods

Schema

  DBIx::DataModel->Schema($schemaName [ => $dbh ] )

Creates a new Perl class of name $schemaName that represents a database schema. That class inherits from DBIx::DataModel. The connection to a DBI database handle can be set via the optional $dbh argument (but it can also be set or reset later) via the dbh method.

Schema methods

SqlDialect

  MySchema->SqlDialect($builtinDialect);
  MySchema->SqlDialect(innerJoin         => "%s INNER JOIN (%s) ON %s",
                       leftJoin          => ...,
                       joinAssociativity => "left" | "right");

SQL has no standard syntax for performing joins, so if your database wants a particular syntax you will need to declare it. Current builtin dialects are either MsAccess, BasisODBC or Default (contributions to enrich this list are welcome). Otherwise you supply the following information :

innerJoin

a string in "sprintf" in perlfunc format, with placeholders for the left table, the right table and the join criteria. Default is %s INNER JOIN %s ON %s. If your database does not support inner joins, set this to undef and the generated SQL will be in the form T1, T2, ... Tn WHERE ... AND ... .

leftJoin

a string for left outer joins. Default is %s LEFT OUTER JOIN %s ON %s.

joinAssociativity

either left or right

Table

  MySchema->Table($pckName, $dbTable, @primKey)

Creates a new Perl class of name $pckName that represents a database Table. That class inherits from MySchema. $dbTable should contain the name of the table in the database. @primKey should contain the name(s) of the column(s) holding the primary key for that table. This info will be used for interpreting arguments to the fetch method, and for filling WHERE clauses in the SQL generated by the update method.

View

  MySchema->View($viewName =>$columns => $dbTables => \%where, @parentTables)

Creates a new Perl class of name $viewName that represents a SQL SELECT request. The only method of that class is the select method, which will

  • select records from the database according to the criteria of the view, merged with the criteria of the request;

  • apply the 'fromDB' handlers of the parent tables to those records;

  • bless the results into objects of MyView that inherit the role methods of each of the @parentTables (see method Association below).

Views are useful to build queries with specific SQL clauses like for example

  MySchema->View(MyView =>
     "DISTINCT column1 AS c1, t2.column2 AS c2",
     "Table1 AS t1 LEFT OUTER JOIN Table2 AS t2 ON t1.fk=t2.pk",
     {c1 => 'foo', c2 => {-like => 'bar%'}},
     @parentTables)

See SQL::Abstract and the select method below for a complete description of what to put in the %where argument. For the moment, just consider the following example:

  my $lst = MyView->select({c3 => 22});

would generate the SQL statement:

  SELECT DISTINCT column1 AS c1, t2.column2 AS c2
  FROM  Table1 AS t1 
        LEFT OUTER JOIN Table2 AS t2 ON t1.fk=t2.pk
  WHERE (c1 = 'foo' AND c2 LIKE 'bar%' AND c3 = 22)

The optional list of @parentTables contains names of Perl table classes from which the view will inherit (so that role methods of these tables become available to instances of MyView).

Perl views as defined here have nothing to do with views declared in the database itself. Perl views are totally unknown to the database, they are just abstractions of SQL statements. If you need to access database views, just use the Table declaration, like for a regular table.

Association

  MySchema->Association([$table1, $role1, $arity1, @columns1], 
                        [$table2, $role2, $arity2, @columns2]);

Declares an association between two tables (or even two instances of the same table), in a UML-like fashion. Each side of the association specifies its table, the "rolename" of of this table in the association, the arity (also called "multiplicity"), and the name of the column or list of columns that technically implement the association as a database join. Arities should be written in the UML form '0..*', '1..*', '0..1', etc. (minimum .. maximum number of occurrences); this will influence how role methods and views are implemented, as explained below. Arity '*' is a shortcut for '0..*', and arity '1' is a shortcut for '1..1'. Role names should be chosen so as to avoid conflicts with column names in the same table.

As a result of the association declaration, the Perl class corresponding to $table1 will get an additional method named $role2 for accessing the associated object(s) in $table2; that method normally returns an arrayref, unless $arity2 has maximum '1' (in that case the return value is a single object ref). Of course, $table2 conversely gets a method named $role1.

Such role methods perform joins within Perl (as opposed to joins directly performed within the database). That is, given a declaration

  MySchema->Association([qw/Activity   activities 0..* emp_id/],
                        [qw/Employee   employee   1 emp_id/]);

we can call

  my @acts = $anEmployee->activities

which will implicitly perform a

  SELECT * FROM Activity WHERE emp_id = $anEmployee->{emp_id}

The role method can also accept additional parameters in SQL::Abstract format (see also the select method in this module). So for example

  my @acts = $anEmployee->activities([qw/act_name salary/], {isActive => 'Y'});

would perform the following SQL request :

  SELECT act_name, salary FROM Activity WHERE 
    emp_id = $anEmployee->{emp_id} AND
    isActive = 'Y'

To specify a unidirectional association, just supply 0 or an empty string (or even the string "0" or '""' or "none") to one of the role names. In that case the corresponding role method is not generated.

The Association method only supports binary associations; however, you can create a View from a series of associations, in order to simultaneously join many tables : see method ViewFromRoles.

ViewFromRoles

  my $view = MySchema->ViewFromRoles($table => $role1 => $role2 => ..);

Creates a View class, starting from a given Table Class and following one or several associations through their role names. It calls the View method, with a collection of parameters automatically inferred from the associations. So for example

  MySchema->ViewFromRoles(qw/Department activities employee/)

is equivalent to

  my $sql = <<_EOSQL_
  Department 
    LEFT OUTER JOIN Activity ON Department.dpt_id = Activity.dpt_id
    LEFT OUTER JOIN Employee ON Activity.emp_id   = Employee.emp_id
  _EOSQL_
  
  MySchema->View(DepartmentActivitiesEmployee => '*' => $sql => 
                 qw/Department Activity Employee/);

For each pair of tables, the kind of join is chosen according to the arity declared with the role : if the minimum arity is 0, the join will be LEFT OUTER JOIN; otherwise it will be a usual inner join. (exception : after a first left join, all remaining tables are also connected through additional left joins).

The view name will be composed by concatenating the table and the capitalized role names. Since such names might be long and uncomfortable to use, the view name is also returned as result of the method call, so that the client code can store it in a variable and use it as an alias.

The main purpose of ViewFromRoles is to gain efficiency in interacting with the database. If we write

  foreach my $dpt (Department->select) {
    foreach my $act ($dpt->activities) {
      my $emp = $act->employee;
      printf "%s works in %s since %s\n", 
         $emp->lastname, $dpt->dpt_name, $act->d_begin
    }
  }

many database calls are generated behind the scene. Instead we could write

  my $view = MySchema->ViewFromRoles(qw/Department activities employee/);
  foreach my $row ($view->select) {
    printf "%s works in %s since %s\n", 
      $row->lastname, $row->dpt_name, $row->d_begin

  }

which generates one single call to the database.

Currently, ViewFromRoles is stupidly linear. It does not support multiple roles originating in the same table. Neither does it support multiple occurrences of the same table (through self-referential associations). These restrictions will be removed in a future release, but will require a more sophisticated API (tree structure).

Hybrid methods (for Schema or Table classes)

ColumnType

  MySchema->ColumnType(typeName => handler1 => coderef1,
                                   handler2 => coderef2, ...)
  MyTable ->ColumnType(typeName => qw/column1 column2 .../)

When applied to a schema class, this method declares a column type of name typeName, to which a number of handlers are associated (see methods ColumnHandlers and applyColumnHandlers).

When applied to a table class, the handlers associated to typeName are registered for each of the columns column1, column2 (throug a call to ColumnHandlers).

Such column types can be used for example for automatic conversion of values between database and memory, through 'fromDB' and 'toDB' handlers.

NoUpdateColumns

  MySchema->NoUpdateColumns( @columns );
  MyTable ->NoUpdateColumns( @columns );

Sets an array of column names that will be excluded from INSERT/UPDATE statements. This is useful for example when some column are set up automatically by the database (like automatic time stamps or user identification). It can also be useful if you want to temporarily add information to memory objects, without passing it back to the database.

NoUpdate columns can be set for a whole Schema, or for a specific Table class.

AutoUpdateColumns

  MySchema->AutoUpdateColumns( columnName1 => sub{...}, ... );
  MyTable ->AutoUpdateColumns( columnName1 => sub{...}, ... );

Declares handler code that will automatically fill column names columnName1, etc. at each update, either for a single table, or (if declared at the Schema level), for every table. For example, each record could remember who did the last modification with something like

  MySchema->AutoUpdateColumns( last_modif => 
    sub{$ENV{REMOTE_USER} . ", " . localtime}
  );

The handler code will be called as

  $handler->($record, $table)

so that it can know something about its calling context. In most cases, however, the handler will not need these parameters, because it just returns global information such as current user or current date/time.

Table methods

ColumnHandlers

  Table->ColumnHandlers($columnName => handlerName1 => coderef1,
                                       handlerName2 => coderef2, ...)

Associates some handlers to a given column in the current table class. Then, when you call $obj->applyColumnHandlers('someHandler'), each column having a handler of the corresponding name will execute the associated code. This can be useful for all sorts of data manipulation :

  • converting dates between internal database format and user presentation format

  • converting empty strings into null values

  • inflating scalar values into objects

  • column data validation

Handlers receive the column value as usual through $_[0]; conversion handlers should modify the value in place (beware not to modify a local copy, which would be a no-op). Handlers also receive additional info in in the remaing arguments :

  sub myHandler {
    my ($columnValue, $obj, $columnName, $handlerName) = @;
    my $newVal = $obj->computeNewVal($columnValue, ...);
    $columnValue = $newVal; # WRONG : will be a no-op
    $_[0] = $newVal;        # OK    : value is converted
  }

The second argument $obj is the object from where $columnValue was taken ; it may be either a plain Perl hashref, or an object of a Table or View class (it all depends on the client application, so the handler cannot make any assumptions).

Handler names 'fromDB' and 'toDB' have a special meaning : they are called automatically just after reading data from the database, or just before writing into the database. Handler name 'validate' is used by the method hasInvalidColumns.

AutoExpand

  Table->AutoExpand(qw/role1 role2 .../)

Generates an autoExpand method for the class, that will autoexpand on the roles listed (i.e. will call the appropriate method and store the result in a local slot within the object). In other words, the object knows how to expand itself, fetching information from associated tables, in order to build a data tree in memory.

Be careful to avoid loops when specify autoexpands, otherwise you will generate an infinite tree and break your program. For example this would be problematic :

  Employee->Autoexpand(qw/activities/);
  Activity->Autoexpand(qw/employee/);

Runtime methods

Utility methods common to the whole framework

isSchema

True if the invocant is a schema class.

classData

Returns a ref to a hash for storing class-specific data. Each subclass has its own hashref, so class data is NOT propagated along the inheritance tree.

schema

Returns the name of the schema class for the current object.

dbh

  Schema->dbh( [$dbh] );
  Table->dbh;
  $record->dbh;

Returns or sets the handle to a DBI database handle (see DBI). This handle is schema-specific. DBIx::DataModel expects the handle to be opened with RaiseError => 1 (see below "Transactions and error handling").

debug

  Schema->debug( 1 );            # will warn for each SQL statement
  Schema->debug( $debugObject ); # will call $debugObject->debug($sql)
  Schema->debug( 0 );            # turn off debugging

noUpdateColumns

Returns the array of column names declared as noUpdate, either in the Schema or in the Table class of the invocant.

autoUpdateColumns

Returns the array of column names and associated handlers declared as autoUpdate, either in the Schema or in the Table class of the invocant.

Table class methods

table

Returns the database table name registered via registerTable(..).

primKey

Returns the list of primary keys registered via registerTable(..).

blessFromDB

  Table->blessFromDB($record) 

Blesses $record into an object of the current class, and applies the fromDB column handlers.

select

  MyTable->select(-columns => \@columns, 
                  -where   => \%where, 
                  -orderBy => \@order)
  MyTable->select(\@columns, \%where, \@order)
  MyView ->select(\@columns, \%where, \@order)

Applies a SQL SELECT to the associated table (or view), and returns a ref to the array of resulting records, blessed into objects of the current class. Arguments are all optional and may be passed either by name or by position. The API is borrowed from SQL::Abstract :

  • the first argument \@columns is a reference to a list of column names. Actually, it can also be a string, such as "column1 AS c1, column2 AS c2". If omitted, \@columns takes the default MyTable->columns, which is usually '*'.

  • the second argument \%where is a reference to a hash or array of criteria that will be translated into SQL clauses. In most cases, this will just be something like {col1 => 'val1, col2 => 'val2}; see SQL::Abstract::select for a detailed description of the structure of that hash or array. It can also be a plain SQL string like "col1 IN (3, 5, 7, 11) OR col2 IS NOT NULL".

  • the third argument \@order is a reference to a list of columns for sorting. Again it can also be a plain SQL string like "col1 DESC, col3, col2 DESC".

No verification is done on the list of retrieved \@columns, so it is OK if the list does not contain primary or foreign keys --- but then later attempts to perform joins or updates will obviously fail.

In addition to these "official" arguments, the method may also take a 4th argument <-moreWhere> (hashref), used internally by role methods for passing join criteria. In that case, entries of the "moreWhere" hash are combined with the entries of the regular "where" hash before building the SQL statement. You only need to understand about this if you intend to write additional role methods; an example is shown in section "Self-referential associations" below.

selectSth

  MyTable->selectSth(\@columns, \%where, \@order)

This exactly like the select() method, except that it returns an executed DBI statement handle instead of an arrayref of objects. Use this method whenever you want to iterate yourself through the results :

  my $sth = MyTable->selectSth( ... );
  while (my $row = $sth->fetchrow_hashref) {
    MyTable->blessFromDB($row);
    workWith($row);
  }

preselectWhere

  my $meth = MyTable->preselectWhere({col1 => $val1, ...})

Returns a reference to a function that will select data from MyTable, just like the select() method, but where some additional selection criteria are "preselected". The preselection criteria are specified in SQL::Abstract format. This method is mainly for internal use; you only want to learn about it if you intend to write your own role methods.

fetch

  MyTable->fetch(@keyValues) 

Searches the single record whose primary key is @keyValues. Returns undef if none is found or if an error is encountered (check DBI::err to find out which).

insert

  MyTable->insert({col1 => $val1, col2 => $val2, ...}, {...})

Inserts new records into the database, after having applied the 'toDB' handlers. This operation may modify the argument data (manipulating values through 'toDB' handlers, or deleting columns declared as 'noUpdate'). See also section "Transactions and error handling" below.

applyColumnHandlers

  Table->applyColumnHandlers($handlerName, \@objects);
  $record->applyColumnHandlers($handlerName);

This is both a class and an instance method. It looks for all columns having a handler named $handlerName (see method ColumnHandlers for how to declare handlers, and for the special handler names 'fromDB', 'toDB' and 'validate'). Found handlers are then applied, either to the current object, or to the list of objects supplied in the optional second argument. The results of handler calls are collected into a hashref, with an entry for each column name. The value of each entry depends on how applyColumnHandlers was called : if it was called as an instance method, then the result is something of shape

  {columnName1 => resultValue1, columnName2 => resultValue2, ... }

if it was called as a class method (i.e. if \@objects is defined), then the result is something of shape

  {columnName1 => [resultValue1forObject1, resultValue1forObject2, ...], 
   columnName2 => [resultValue2forObject1, resultValue2forObject2, ...], 
   ... }

If columnName is not present in the target object(s), then the corresponding result value is undef.

update

  MyTable->update({column1 => value1, ...});
  MyTable->update(@primKey, {column1 => value1, ...});
  $record->update;

This is both a class and an instance method. It updates the database after having applied the 'toDB' handlers. See also below "Transactions and error handling".

When called as a class method, the columns and values to update are supplied as a hashref. The second syntax with @primKey is an alternate way to supply the values for the primary key (it may be more convenient because you don't need to repeat the name of primary key columns). So if emp_id is the primary key of table Employee, then the following are equivalent :

  Employee->update({emp_id => $eid, address => $newAddr, phone => $newPhone});
  Employee->update($eid => {address => $newAddr, phone => $newPhone});

When called as an instance method, the columns and values to update are taken from the object in memory. After the update, the memory for that object is destroyed (to prevent any confusion, because the 'toDB' handlers might have changed the values). So to continue working with the same record, you must fetch it again from the database.

In either case, you have no control over the 'where' clause of the SQL update statement, that will be based exclusively on primary key columns. So if you need to simultaneously update several records with a SQL request like

  UPDATE myTable SET col='newVal' WHERE otherCol like 'criteria%'

then you should generate the SQL yourself and pass it directly to DBI->do($sql).

The update method only updates the columns received as arguments : it knows nothing about other columns that may sit in the database table. Therefore if you have two concurrent clients doing

  (client1)  MyTable->update($id, {c1 => $v1, c2 => $v2});
  (client2)  MyTable->update($id, {c3 => $v3, c4 => $v4, c5 => $v5});

the final state of record $id in the database is consistent. This would not be the case in an RDBMS-OO mapping framework that systematically updates all columns.

delete

  Table->delete({column1 => value1, ...});
  Table->delete(@primKey);
  $record->delete;

This is both a class and an instance method. It deletes a record from the database. See also below "Transactions and error handling".

When called as a class method, the primary key of the record to delete is supplied either as a hashref, or directly as a list of values. Note that MyTable->delete(11, 22) does not mean "delete records with keys 11 and 22", but rather "delete record having primary key (11, 22)"; in other words, you only delete one record at a time. In order to simultaneously delete several records, you must generate the SQL yourself and go directly to the DBI level.

When called as an instance method, the primary key is taken from object columns in memory. After the delete, the memory for that object is destroyed.

Methods for records only (instances of Tables)

hasInvalidColumns

Applies the 'validate' handler to all existent columns. Returns a ref to the list of invalid columns, or undef if there are none.

Note that this is validation at the column level, not at the record level. As a result, your validation handlers can check if an existent column is empty, but cannot check if a column is missing (because in that case the handler would not be called).

Your 'validate' handlers, defined through ColumnHandlers, should return 0 or an empty string whenever the column value is invalid. Never return undef, because we would no longer be able to distinguish between an invalid existent column and a missing column.

expand

  $obj->expand($role [, @args] )

Executes the method $role to follow an Association, and stores the result in the object itself under $obj->{$role}. This is typically used to expand an object into a tree datastructure. Optional @args are passed to $obj->$role(@args).

After the expansion, beware of the difference between $obj->{$role} (the stored result) and $obj->$role (calling the role again through a new SQL query).

autoExpand

  $record->autoExpand( [$recurse] )

Asks the object to expand itself with some objects in foreign tables. By default does nothing, should be redefined in subclasses, most probably through the AutoExpand method (with capital 'A'). If the optional argument $recurse is true, then autoExpand is recursively called on the expanded objects.

selectFromRoles

  my $lst = $obj->selectFromRoles(qw/role1 role2 .../)
                ->(-columns => [...], -where => {...}, -orderBy=>[...]);

Starting from a given object, returns a reference to a function that selects a collection of data rows from associated tables, performing the appropriate joins. Internally this is implemented throught the ViewFromRoles method, with an additional join criteria to constrain on the primary key(s) of $obj. The returned function takes the same arguments as the select method. So for example if $emp->emp_id == 987, then

  $emp->selectFromRoles(qw/activities department/)->({d_end => undef})

will generate

  SELECT * FROM Activity INNER JOIN Department 
                         ON Activity.dpt_id = Department.dpt_id
           WHERE  emp_id = 987 AND d_end IS NULL

automatic column read accessors through AUTOLOAD

Columns have implicit read accessors through AUTOLOAD. So instead of $record->{column} you can write $record->column.

I know this is a bit slower than generating all accessors (through Class::Accessor or something similar), but the advantage is that you don't need to know all column names in advance. This is how we support variable column lists (two instances of the same Table do not necessarily hold the same set of columns, it all depends on what you chose when doing the SELECT).

OTHER CONSIDERATIONS

Namespaces, classes, methods

DBIx::DataModel automatically generates Perl classes for Schemas, Tables, Views, Associations. Before doing so, it checks that no Perl package of the same name already exists. A similar check is performed before adding role methods into classes.

The client code can insert additional methods into the generated classes : just switch to the package and define your code. However, because of the security checks just mentioned, DBIx::DataModel must create the package before you start adding methods to it, and therefore the declarations should be inside a BEGIN block :

  BEGIN { # make sure these declarations are immediately executed
  
    DBIx::DataModel->Schema('MySchema'); 
    MySchema->Table(Activity   => Activity   => qw/act_id/);
    ...
  }
  
  # now we can safely add new methods
  
  package Activity; 
  
  sub activePeriod {
    my $self = shift;
    $self->{d_end} ? "from $self->{d_begin} to $self->{d_end}"
                   : "since $self->{d_begin}";
  }
  
  package main;                 # switch back to the 'main' package

See perlmod for an explanation of BEGIN blocks.

Interaction with the DBI layer

Transactions and error handling

DBIx::DataModel follows the recommendations of DBI for transactions : it expects the database handle to be opened with RaiseError => 1 and therefore does not check itself for DBI errors ; it is up to the client code to catch the exceptions and deal with errors.

As explained in "Transactions" in DBI, AutoCommit should be set off for databases that support transactions; then atomic operations are enclosed in an eval, followed by either $dbh->commit() (in case of success) or $dbh->rollback() (in case of failure).

Calling DBI directly

Consider again the following excerpt from the SYNOPSIS :

  package Departement; 
  
  sub currentEmployees {
    my $self = shift;
    my $currentAct = $self->activities({d_end => [{-is  => undef},
                                                  {">" => $today}]});
    return map {$_->employee} @$currentAct;
  }

This code crosses two tables and generates n + 1 calls to the database , where n is the number of current activities in the department. This can be optimized by performing the join within the database, instead of doing it in Perl, which reduces to one single call to the database. So if we are ready to code directly at the DBI level, we could write

  package Departement; 
  
  sub currentEmployees {
    my $self = shift;
    my $sql = "SELECT Employee.* FROM Employee, Activity WHERE ".
              "Activity.emp_id = Employee.emp_id AND ".
              "(d_end is null or d_end <= '$today')";
    my $empl = $self->dbh->selectall_arrayref($sql, {Slice => {}});
    Employee->blessFromDB($_) foreach @$empl;
    return $empl;
  }

Actually, in this example there is an even simpler way to do it:

  package Departement; 
  
  sub currentEmployees {
    my $self = shift;
    
    MySchema->ViewFromRoles(qw/Activity employee/)
            ->select("Employee.*", {d_end => [{-is  => undef},
                                              {">" => $today}]});
  }

Self-referential associations

Associations can be self-referential, i.e. describing tree structures :

  MySchema->Association([qw/OrganisationalUnit parent   1 ou_id/],
                        [qw/OrganisationalUnit children * parent_ou_id/],

However, when there are several self-referential associations, we might get into problems : consider

  MySchema->Association([qw/Person mother   1 pers_id/],
                        [qw/Person children * mother_id/]);
  MySchema->Association([qw/Person father   1 pers_id/],
                        [qw/Person children * father_id/]); # BUG

This does not work because there are two definitions of the "children" role name in the same class "Person". One solution is to distinguish these roles, and then write by hand a general "children" role :

  MySchema->Association([qw/Person mother         1 pers_id/],
                        [qw/Person motherChildren * mother_id/]);
  MySchema->Association([qw/Person father         1 pers_id/],
                        [qw/Person fatherChildren * father_id/]);
  
  package Person;
  sub children {
    my $self = shift;
    my $id = $self->{pers_id};
    my $sql = "SELECT * FROM Person WHERE mother_id = $id OR father_id = $id";
    my $children = $self->dbh->selectall_arrayref($sql, {Slice => {}});
    Person->blessFromDB($_) foreach @$children;
    return $children;
  }

Alternatively, since rolenames motherChildren and fatherChildren are most probably useless, we might just specify unidirectional associations :

  MySchema->Association([qw/Person mother  1 pers_id/],
                        [qw/Person none    * mother_id/]);
  MySchema->Association([qw/Person father  1 pers_id/],
                        [qw/Person none    * father_id/]);

And there is a more sophisticated way to define the "children" method, that will accept additional "where" criteria, like every regular method.

  package Person;
  sub children {
    my $self = shift;
    my $id = $self->{pers_id};
    Person->preselectWhere([mother_id => $id, father_id => $id])->(@_);
  }

This definition takes advantage of the hidden "moreWhere" argument of the select method to force the join on mother_id or father_id, while leaving open the possibility for the caller to specify additional criteria. For example, all female children of a person (either father or mother) can now be retrieved through

  $person->children({gender => 'F'})

Observe the use of the -nest operator from SQL::Abstract to insert an arrayref (that will generate an SQL 'OR').

SEE ALSO

Some alternative modules in this area are Alzabo, Tangram, SPOPS, Class::PObject, Class::DBI, DBIx::RecordSet, DBIx::SQLEngine,DBIx::Record, DBIx::Class, and a lot more in the DBIx::* namespace, all with different approaches. For various reasons, none of these did fit nicely in my context, so I decided to write DBIx:DataModel. Of course there might be also many reasons why DBIx:DataModel will not fit in your context, so just do your own shopping. A good place to start would be the general discussion on RDBMS - Perl mappings at http://poop.sourceforge.net. There are also some pointers in the Perl 5 Enterprise Environment website at http://www.officevision.com/pub/p5ee/.

TO DO

  - autodeclare tables and associations from $dbh->table_info, etc.
  - 'hasInvalidColumns' : should be called automatically before insert/update ?
  - 'validate' record handler
  - 'normalize' handler : for ex. transform empty string into null
  - walk through WHERE queries and apply 'toDB' handler (not obvious!)
  - add a 'Column' method to tables, so they can declare which columns
    should be retrieved by default
  - add the UML notions of Aggregation and Composition (would mean 
    additional methods for adding and removing parts of an aggregate;
    automatic deletion of composite parts)
  - decide what to do with multiple inheritance of role methods in Views. 
    use NEXT ?
  - table aliases
  - keep track of all tables, views, assoc. in schema->classData ? useful ?
  - insert() method : return last_insert_id(s) from DBI
  - remove restrictions in ViewFromRoles

AUTHOR

Laurent Dami, <laurent.dami AT etat geneve ch>

COPYRIGHT AND LICENSE

Copyright 2005 by Laurent Dami.

This library is free software; you can redistribute it and/or modify it under the same terms as Perl itself.