NAME

DBIx::DataModel::Doc::Reference - General reference for DBIx::DataModel

DOCUMENTATION CONTEXT

This chapter is part of the DBIx::DataModel manual.

WARNING : Since version 2.0, a number of important changes have been made in the API (classes renamed, arguments renamed or reorganized, etc. -- see DBIx::DataModel::Doc::Delta_v2). If you need backwards compatibility, make sure to load the 1.0 compatibility layer.

INTRODUCTION

This chapter is the detailed reference description for most public methods in DBIx::DataModel.

Just a couple of peripheric classes are described in separate documents : this is the case for DBIx::DataModel::Schema::Generator (automatic schema generation from external sources), and for DBIx::DataModel::Statement::JDBC (special methods for working with DBD::JDBC).

META-SCHEMA DEFINITION

Most methods listed below for defining schema, tables, associations, etc. come in two flavours :

  • a "front-end" method, starting with an uppercase letter, that uses positional parameters. This version is prefered for conciseness and for backwards compatibility.

  • a "back-end" method, with a name of shape define_*, that uses named parameters. This version is prefered for completeness.

The invocant for front-end methods can be either the Schema class name, or the meta-schema instance. The invocant for back-end methods must be the meta-schema instance.

Schema()

  my $schema_class = DBIx::DataModel->Schema($schema_name, %options);

Front-end method for "define_schema()". The call above is equivalent to

  my $meta_schema = DBIx::DataModel->define_schema(
    class => $schema_name,
    %options,
   );
  my $schema_class = $meta_schema->class;

define_schema()

  my $meta_schema = DBIx::DataModel->define_schema(
     class => $schema_name, 
     %options,
   );

Creates a new Perl class of name $schema_name that represents a database schema. That class inherits from DBIx::DataModel::Schema. At the same time, an instance of DBIx::DataModel::Meta::Schema is also created, for holding meta-information about that schema (for example lists of classes, associations, types, etc. that will be declared within the schema). Possible %options are :

class

Name of the Perl schema class to be created.

isa

Parent class(es) for that schema class (scalar or arrayref). The default parent is DBIx::DataModel::Schema.

auto_insert_columns

A hashref specifying columns to be automatically inserted in every table. Keys of the hashref are column names, values are handlers (coderefs). This can also be done separately for each table (see the auto_insert_columns parameter to "define_table()").

For example, each record could remember who created it and when, with something like

  my $who_and_when = sub { $ENV{REMOTE_USER} . " at " . localtime };

  --->define_schema( 
        ...,
        auto_insert_colums => {created_by => $who_and_when},
      );

The handler code will be called as

  $handler->(\%record, $table_class)

so that it can know something about its calling context. In most cases, however, the handler will not need these parameters, because it just returns global information such as current user or current date/time.

auto_update_columns

A hashref specifying columns to be automatically updated in every table. Keys of the hashref are column names, values are handlers (coderefs). This specification can also be done separately for each table (see the auto_update_columns parameter to "define_table()"). Such handlers are called at each update and each insert; if you use both auto_insert_columns and auto_update_columns, make sure that the column names are not the same.

For example, elaborating on the previous section, each record could also remember the last modification with something like

  ...->define_schema( 
         ...,
         auto_insert_colums => {created_by => $who_and_when},
         auto_update_colums => {updated_by => $who_and_when},
       );
no_update_columns

A hashref specifying columns to be automatically ignored in insert/update operations, for every table. This can also be done separately for each table (see the no_update_columns parameter to "define_table()"). Keys of the hashref are column names, values are indifferent (can be any true value).

This is useful for example when some column are set up automatically by the database (like automatic time stamps or user identification). It can also be useful if you want to temporarily add information to memory objects, without passing it back to the database.

sql_no_inner_after_left_join

An optional boolean; if true, a LEFT OUTER JOIN in a multi-steps join specification will force all subsequent joins to be also LEFT. For example in the fake datamodel used in the rest of this documentation, a join like

  HR->join(qw/Employee activities department/)->...

would be interpreted as

  T_Employee LEFT OUTER JOIN T_Activity ON ...
             LEFT OUTER JOIN T_Department ON ...

even if the association betwen Activity and Department is many-to-one (which theoretically would result in a INNER JOIN by default).

table_parent, join_parent

Optional application-specific classes, to be used instead of the builtin classes, as parents for tables and joins declared in this schema.

table_metaclass, join_metaclass, association_metaclass, path_metaclass, type_metaclass

Optional application-specific classes, to be used instead of the builtin metaclasses, for instanciating meta-objects declared in this schema, .

statement_class

Optional application-specific class to be used instead of the builtin class DBIx::DataModel::Statement for instanciating statements.

DATA DEFINITION (TABLES, ASSOCIATIONS, etc.)

This chapter deals with all methods for populating the structure of a meta-schema : defining tables, associations, types, navigation methods, etc. Such objects are defined statically and will not change during program execution.

By contrast, the "SCHEMAS" chapter below deals with properties specific to each schema instance, such as database connection, debugging properties, etc.

Reflection methods for navigating in the meta-schema structure and getting back information about tables, associations, etc. are described in chapter "META-SCHEMA NAVIGATION", at the end of this document; such methods are seldom needed for regular applications, but they can be useful for building tools around DBIx::DataModel, such as CRUD frameworks, editors, translators, etc.

Meta-schema front-end methods

Table()

  $schema_class->Table($class_name, $db_name, @primary_key, \%options);

See "define_table()". The call above is equivalent to

  $meta_schema->define_table(class       => $class_name,
                             db_name     => $db_name,
                             primary_key => \@primary_key,
                             %options);

View()

  $schema->View($class_name, $default_columns, $sql, 
                     \%where, @parent_tables);

This is another front-end method for "define_table()", for creating a new Perl class of name $class_name that represents a SQL SELECT request of shape

  SELECT $columns FROM $db_tables [ WHERE %where ]

The call above is equivalent to

  my @parents = map {$meta_schema->table($_)} @parent_tables;
  $schema->metadm->define_table(class           => $class_name,
                                db_name         => $sql,
                                where           => \%where,
                                default_columns => $default_columns,
                                parents         => \@parents);

The @parent_tables list contains names of Perl table classes from which the view will also inherit. If the SQL code in $sql is a join between several tables, then it is a good idea to mention these tables in @parent_tables, so that their path methods become available to instances of this view. Be careful about table names : the SQL code in $sql should contain database table names, whereas the members of @parent_tables should be Perl table classes.

Perl views as defined here have nothing to do with views declared in the database itself. Perl views are totally unknown to the database, they are just abstractions of SQL statements. If you need to access database views, just use the Table() declaration, like for a regular table.

Association()

  $schema->Association([$class1, $role1, $multiplicity1, @columns1],
                       [$class2, $role2, $multiplicity2, @columns2]);

See "define_association()". The call above is equivalent to

  $schema->metadm->define_association(
    A => {table        => $class1->metadm, role       => $role1, 
          multiplicity => $multiplicity1,  join_cols  => \@columns1},
    B => {table        => $class2->metadm, role       => $role2, 
          multiplicity => $multiplicity2,  join_cols  => \@columns2},
    kind => 'Association',
  );

Composition()

  $schema->Composition([$class1, $role1, $multiplicity1, @columns1],
                       [$class2, $role2, $multiplicity2, @columns2]);

See "define_association()". The call above is equivalent to

  $schema->metadm->define_association(
    A => {table        => $class1->metadm, role       => $role1, 
          multiplicity => $multiplicity1,  join_cols  => \@columns1},
    B => {table        => $class2->metadm, role       => $role2, 
          multiplicity => $multiplicity2,  join_cols  => \@columns2},
    kind => 'Composition',
  );

See also "Special semantics for compositions" below.

Type()

  $schema->Type($type_name => %handlers);

See "define_type()". The call above is equivalent to

  $schema->metadm->define_type(name => $type_name, handlers => \%handlers);

Meta-schema back-end methods

define_table()

  $meta_schema->define_table(%options);

Creates a new Perl class of name $class_name that represents a database table. The new class inherits from the table_parent declared in the schema (by default : DBIx::DataModel::Source::Table). At the same time, an instance of the table_metaclass is also created (by default : DBIx::DataModel::Meta::Source::Table), for holding meta-information about that table (database name, primary key, paths to other tables, etc.).

Returns $meta_schema. Entries in %options are :

class => $string

Name of the class to be created. If $class_name contains no ::, then the schema name is prepended to it (so the new table class is created in the namespace of the schema, which is a recommended best practice).

db_name => $string

SQL name of the database table or view that will be accessed through this class. Actually, it might be any SQL clause, like for example Table1 t1 INNER JOIN Table2 t2 ON t1.fk=t2.pk; see the "View()" method below.

primary_key => $string | \@array

Name of the column (or list of columns) that hold the primary key for that table.

default_columns => $string

Whatever will be injected into SELECT statements, when no -columns parameter is explicitly supplied. The default is '*'.

where => \%hash | \@array

An optional reference to a WHERE condition, in SQL::Abstract::More format. That condition will be automatically injected into every SELECT statement on this table. When working with regular tables, this parameter is always empty; it is only useful for declaring DBIx::DataModel views (see the "View()" method below).

parents => [$parent1, ...]

List of meta-tables from which this table will inherit. This can be useful if your database supports table inheritance (like for example PostgreSQL), and you want to reflect the inheritance structure within the Perl table classes.

column_types

An optional hashref, where keys are type names and values are arrayrefs of columns on which this type should be applied; see "define_column_type()".

column_handlers

An optional hashref, where keys are column names and values are hashrefs of { $handler_name => $handler_body }; see "define_column_handlers()".

Then, when you call $obj->apply_column_handler($handler_name), each column having a handler of the corresponding name will execute the associated code.

auto_insert_columns

A hashref of column names and handlers, like in "define_schema()", but only for one specific table.

auto_update_columns

A hashref of column names and handlers, like in "define_schema()", but only for one specific table.

no_update_columns

A hashref of column names, like in "define_schema()", but only for one specific table.

define_association()

Features common to all kinds of associations

  $meta_schema->define_association(
    name   => $association_name,          # optional
    kind   => $kind,                      # 'Association' or 'Composition'
    A      => {
      table        => $meta_table_instance,
      role         => $role_name,         # optional
      multiplicity => $multiplicity_spec, # ex. "1..*"
      join_cols    => [$col1, ...]        # optional
    },
    B      => { ... },                    # same structure as 'A'
  );

Declares an association between two tables (or even two instances of the same table). The arguments are :

A

A description of the first association end, which is composed of

table

An instance of DBIx::DataModel::Meta::Source::Table.

role

The role name of that source within the association. A Perl method of the same name will be defined in the remote source (the other end of the association). Besides, the role name is also used when building joins through

  $schema->join(qw/FirstTable role1 role2 .../)

One of the role names in the association can be anonymous, which is expressed by undef, by an empty string, or a string containing '0', 'none', or '---'. If anonymous, there will be no Perl method and no possibility to join in that direction, so it defines a unidirectional association. Two anonymous roles in both directions are forbidden (because in that case the association would be useless).

If several associations attempt to define the same role name in the same table, an exception is generated.

multiplicity

The multiplicity specification, i.e. the minimum and maximum number of occurrences of that association end, for any given instance of the other end (if not clear, see UML textbooks).

The multiplicity can be expressed either as an arrayref [$min, $max], or as a string "$min..$max". The $max can be '*' or 'n', which is interpreted as the maximum integer value. If expressed as a string, a mere '*' is interpreted as '0..*', and a mere '1' is interpreted as '1..1'.

Numbers different from 0, 1 or * may be given as multiplicity bounds, but this will be just documentary : technically, all that matters is

  • whether the lower bound is 0 or more (if 0, generated joins will be left joins, otherwise inner joins)

  • whether the upper bound is 1 or more (if 1, the associated method returns a single object, otherwise it returns an arrayref)

When the multiplicity is '*', a method named insert_into_... is installed in the class of the other association end. This method create new objects of the associated class, taking care of the linking automatically. For example :

  $an_employee->insert_into_activities({d_begin => $today,
                                        dpt_id  => $dpt});

This is equivalent to

  $schema->table('Activity')->insert({d_begin => $today,
                                      dpt_id  => $dpt,
                                      emp_id  => $an_employee->{emp_id}});
join_cols

An arrayref of columns that participate in the database join, for this side of the association. The full database join will be built by creating a LEFT|INNER JOIN ... ON .. clause in which the left-hand and right-hand sides of the ON subclause come from the join_cols of both association ends.

This argument is optional: if absent, it will be filled by default by taking the primary key of the table with minimum multiplicity 1, for both sides of the association. This default behaviour is convenient for data models where primary keys and foreign keys are identical (for example dpt_id both as primary key in Department and as foreign key in Activity). Some data models have a different policy, where primary keys are always named the same (for example id), and foreign keys are named after the related table name : in such models, the default does not work, and you have to specify the join columns explicitly.

If the association is many-to-many (i.e. if the maximum multiplicity is greater than 1 on both sides), then join_cols takes a special meaning : it no longer represents database columns, but rather represents two role names (in the sense just defined above) to follow for reaching the remote end of the association. Therefore join_cols must contain exactly 2 items in that case : the path to the intermediate table, and the path from the intermediate table to the remote end. Here is again the example from "SYNOPSIS" in DBIx::DataModel :

  My::Schema->define_association(
    kind => 'Association',
    A    => {
      table        => My::Schema::Department->metadm,
      role         => 'departments',
      multiplicity => '*',
      join_cols    => [qw/activities department/],
    },
    B    => {
      table        => My::Schema::Employee->metadm,
      role         => 'employees',
      multiplicity => '*',
      join_cols    => [qw/activities employee/],
    },
  );
B

A description of the second association end, following exactly the same principles as for the 'A' end.

name

Optional name for the association (otherwise an implicit name will be built by default from the concatenation of the role names).

kind

A string describing the association kind, i.e. one of : Association, Aggregation or Composition. See "Composition()" below for the additional semantics associated with compositions.

The association also creates instances of DBIx::DataModel::Meta::Path for representing the directional paths between those sources.

Only binary associations can be declared; however, it is possible to define methods joining three or more tables : see "define_navigation_method()".

Special semantics for compositions

Compositions are associations with some additional semantics. In UML class diagrams, compositions are pictured with a black diamond on one side : this side will be called the composite class, while the other side will be called the component class. In DBIx::DataModel, the diamond (the composite class) corresponds to the A association end, and the component class corresponds to the B end, so the order is important (while for plain associations the order makes no difference).

The UML intended meaning of a composition is that objects of the component classes cannot exist outside of their composite class. Within DBIx::DataModel, the additional semantics for compositions is to support cascaded insertions and deletions, and auto-expansion; so the special semantics attached to kind Composition is :

  • the multiplicity must be 1-to-n

  • the 'B' end of the association (the "component" part) must not be component of another association (it can only be component of one single composite table).

  • this association can be used for auto-expanding the composite object (i.e. automatically fetching all component parts from the database) -- see "expand" in DBIx::DataModel::Source and "auto_expand"

  • this association can be used for cascaded inserts like

      $source->insert({
        column1 => $val1,
        ...
        $component_name1 => [{$sub_object1}, ...],
        ...
       })

    see "insert"

    The main record will be inserted in the composite class, and within the same transaction, subrecords will be inserted into the component classes, with foreign keys automatically filled with appropriate values.

  • this association can be used for cascaded deletes : the argument to a delete may contain lists of component records to be deleted together with the main record of the composite class.

define_type()

  $meta_schema->define_type(
     name     => $type_name,
     handlers => {
       $handler_name_1 => sub { ... },
       $handler_name_2 => sub { ... },
       ...
       },
   );

This declares a type, which is just a hashref of handler names and handler bodies (coderefs). The type can then be applied to some columns in some tables; this is usually done in the Table declaration (column_types argument), or can be applied later through the "define_column_type" method.

Handlers receive the column value as usual through $_[0]. If the value is to be modified (for example for scalar conversions or while inflating values into objects), the result should be put back into $_[0]. In addition to the column value, other info is passed to the handler :

  $handler_body = sub {
    my ($column_value, $obj, $column_name, $handler_name) = @_;
    my $new_val = $obj->compute_new_val($column_value, ...);
    $column_value = $new_val; # WRONG : will be a no-op
    $_[0] = $new_val;         # OK    : value is converted
  }

The second argument $obj is the object from where $column_value was taken -- most probably an instance of a Table or Join class. Use this if you need to read some contextual information, but avoid modifying $obj : you would most probably get unexpected results, since the collection of available columns may vary from one call to the other.

Other arguments $column_name and $handler_name are obvious.

Handler names from_DB and to_DB have a special meaning : they are called automatically just after reading data from the database, or just before writing into the database. Handler name validate is used by the method "has_invalid_columns()".

The "SYNOPSIS" in DBIx::DataModel shows some examples for types "Date", "Percent", "Multivalue" or "XML".

A Type does not create a Perl class.

define_join()

This method builds or retrieves a pre-existing subclass of DBIx::DataModel::Source::Join, and returns the associated meta-object (an instance of DBIx::DataModel::Meta::Source::Join).

  my $meta_join = $meta_schema->define_join($table, $path1, $path2, ..);

The join starts from a given table class and follows one or several associations through their path names; the resulting SQL request contains parameters automatically inferred from the associations. So for example

  $meta_schema->define_join(qw/Department activities employee/);

is more or less equivalent to

  my $sql = <<_EOSQL_
  Department 
    LEFT OUTER JOIN Activity ON Department.dpt_id = Activity.dpt_id
    LEFT OUTER JOIN Employee ON Activity.emp_id   = Employee.emp_id
  _EOSQL_
  
  $schema->View("Department=>activities=>employee", '*', $sql, 
                qw/Department Activity Employee/);

For each pair of tables, the kind of join is chosen according to the multiplicity declared with that role in the association : if the minimum multiplicity is 0, the join will be LEFT OUTER JOIN; otherwise it will be a usual inner join (exception if $meta_schema->sql_no_inner_after_left_join is true : after a first left join, all remaining tables are also connected through additional left joins). The default kind of join chosen by this rule may be overriden by inserting intermediate connectors in the list, namely '<=>' for inner joins and '=>' for left joins. So for example

  $meta_schema->define_join(qw/Department <=> activities <=> employee/);

becomes equivalent to

  my $sql = <<_EOSQL_
  Department 
    INNER JOIN Activity ON Department.dpt_id = Activity.dpt_id
    INNER JOIN Employee ON Activity.emp_id   = Employee.emp_id
  _EOSQL_

Table names in the SQL statement correspond to what was defined in the $meta_schema->Table(...) declarations. However, tables may be aliased using | as a separator :

  $meta_schema->define_join(qw/Department|dpt activities|act employee|emp/)
              ->select(-columns => qw/dpt.name act.d_begin emp.lastname/,
                       -where   => {"dpt.dpt_name" => {-like => "A%"}});

which generates

  SELECT dpt.name, act.d_begin, emp.lastname
  FROM Department AS dpt
    LEFT OUTER JOIN Activity AS act ON dpt.dpt_id = act.dpt_id
    LEFT OUTER JOIN Employee AS emp ON act.emp_id = emp.emp_id
  WHERE dtp.dpt_name like 'A%'

All tables participating in a join are stacked, and further roles are found by walking up the stack. So in

  ..->define_join(qw/FirstTable path1 path2 path3/)

we must find a path1 in FirstTable, from which we know what will be the Table2. Then, we must find a path2 in in Table2, or otherwise in FirstTable, in order to know Table3. In turn, path3 must be found either in Table3, or in Table2, or in FirstTable, etc. To resolve ambiguities, prefix the path by the name or alias of the targeted source, such as :

  ..->define_join(qw/FirstTable path1|p1 
                     FirstTable.path2 
                     p1.path3|p3 
                     path2.path4/)

The name of the resulting join class will be composed by concatenating table, connectors and path names, including optional aliases. If the same sequence of table and paths was already encountered before, the Perl class already exists, and its corresponding meta-object is returned; otherwise, a new Perl class is created together with its meta-object.

The main purpose of define_join is to gain efficiency in interacting with the database. If we write

  foreach my $dpt (@{$schema->table('Department')->select}) {
    foreach my $act ($dpt->activities) {
      my $emp = $act->employee;
      printf "%s works in %s since %s\n", 
         $emp->{lastname}, $dpt->{dpt_name}, $act->{d_begin};
    }
  }

many database calls are generated behind the scene, in the loops that call the activities and employee methods. Instead we could write

  my $join = $meta_schema->define_join(qw/Department activities employee/);
  foreach my $row (@{$join->select}) {
    printf "%s works in %s since %s\n", 
      $row->{lastname}, $row->{dpt_name}, $row->{d_begin};
  }

which generates one single call to the database.

Meta-table methods

define_navigation_method()

  $meta_table->define_navigation_method(
    $meth_name => qw/role1 role2 .../, 
    \%optional_select_args,
  );

Inserts into $meta_source->class a new method named $meth_name, that will automatically call "define_join()" and then select(), passing %optional_select_args to the select call. This is useful for joining several tables at once, so for example with

  $meta_schema->table('Department')->define_navigation_method(
    employees => qw/activities employee/
  );

we can then write

  my $empl_ref = $some_dept->employees(-where   => {gender => 'F'},
                                       -columns => [qw/firstname lastname]);

This method is used internally to implement many-to-many associations; so if you have only two roles to follow, you would probably be better off by defining a many-to-many association, which is a more abstract notion. Direct calls to define_navigation_method are still useful if you want to follow three or more roles at once.

The last argument to define_navigation_method is an optional hashref; if present, the hash will be passed as initial argument to the select call.

define_column_type()

  $meta_table->define_column_type($type_name, @column_names);

Registers type $type_name to be applied to columns with name in @column_names, within the target $meta_table.

define_column_handlers()

  $meta_table->define_column_handlers($column_name, 
    $handler_name_1 => sub { ... },
    ...
  );

Registers handlers to be applied to $column_name, within the target $meta_table. The main purpose of this method is for implementing the higher-level "define_column_type()" method; however it can also be called directly, without the need for defining a type.

If another handler is already installed within $column_name under name $handler_name_1, then both subs are automatically composed into a new sub performing both coderefs. The execution order will correspond to the handler declaration order, except for the special case from_DB, where the last declared handler is executed first.

define_auto_expand()

  $meta_table->define_auto_expand(@component_names);

Generates an "auto_expand()" method for the class, that will autoexpand on the roles listed (i.e. will call the appropriate method and store the result in a local slot within the object). In other words, the object knows how to expand itself, fetching information from associated tables, in order to build a data tree in memory. Only component names declared through Composition() may be auto-expanded.

Be careful about performance issues: when an object uses auto-expansion through a call to "auto_expand()", every auto-expanded role will generate an additional call to the database. This might be slow, especially if there are recursive auto-expansions; so in some cases it will be more appropriate to flatten the tree and use database joins, typically through the join() method.

SCHEMAS

A $schema is an instance of a subclass of DBIx::DataModel::Schema. The subclass holds a reference to a $meta_schema where all information about tables, associations, etc. is kept. Each instance of that subclass holds information about the database connection, the debugging status, etc.

A schema subclass can have several instances; in that case, the application runs in multi-schema mode. However, multi-schema mode is only useful in some special situations, like for example when transferring data between several databases; in most common cases, a single-schema mode is enough.

Single-schema mode is activated by default, which means that all method calls can be performed directly on the subclass; the subclass will manage a singleton instance, and will delegate calls to that singleton. This is the recommended way to work with DBIx::DataModel, because it uses less memory, and simplifies the application code (there is no need to pass a $schema reference around between all modules).

If you need it, multi-schema mode is activated by calling the new() method, as many times as you need. Once this mode is activated, it is not possible to go back to single-schema mode. Furthermore, multi-schema mode should be activated before the singleton has been created, i.e. before any call to the class methods below.

Because of this duality, all methods below, described as $schema->some_method(...), can be called either as class methods (single-schema mode), or as instance methods (multi-schema mode).

Creating and manipulating schema state

Schema::new()

  my $schema = $schema_subclass->new(%options);

As just explained, this activates multi-schema mode, and returns a new schema instance. %options may contain some initial settings for dbh, debug, etc.; this is the same as creating the schema first, and then calling the setter methods below $schema->dbh(...), $schema->debug(...), etc.

singleton()

  my $schema = $schema_subclass->singleton;

When in single-schema mode, returns the singleton instance associated with this $schema_subclass. When in multi-schema mode (i.e. after Schema::new() has been called at least once); raises an error.

Client code rarely needs to call singleton() explicitly; when in single-schema mode, method calls are usually expressed as class methods, and the singleton is retrieved automatically.

dbh()

  my $dbh = DBI::connect(...);
  $schema->dbh($dbh, %options);        # set
  $schema->dbh([$dbh, %options]);      # set, alternative syntax

  my $dbh             = $schema->dbh;  # get back just the dbh
  my ($dbh, %options) = $schema->dbh;  # get back all

Returns or sets the handle to a DBI database handle (see DBI). The $dbh handle must have its RaiseError property set to a true value.

In %options you may pass any key-value pairs, and retrieve them later by calling dbh in a list context, which may be useful for holding driver-specific information. DBIx::DataModel will only inspect those options for knowing how to retrieve database-generated keys (see the "insert" method below).

Calls to the dbh method for changing the database handle are allowed at any time, except when a transaction is in course. However, a nested transaction may temporarily change the database handle by supplying it as argument to the "do_transaction" method.

To unset the database handle, call $schema->dbh(undef).

debug()

  $schema->debug(1);             # will warn for each SQL statement
  $schema->debug($debug_object); # will call $debug_object->debug($sql)
  $schema->debug(undef);         # turn off debugging

Debug mode is useful for seeing SQL statements generated by DBIx::DataModel. Enabling debugging with a $debug_object will typically be useful in conjunction with something like Log::Log4perl or Log::Dispatch. Otherwise, enabling debugging with just any true scalar will print SQL statements on STDERR.

There is also another way to see the SQL code for one particular statement :

  my $spy_sql = sub {my ($sql, @bind) = @_;
                     print STDERR join "\n", $sql, @bind;
                     return ($sql, @bind);};
  
  my $result = $source->select(-columns   => \@columns,
                               -where     => \%criteria,
                               -post_SQL  => $spy_sql);

sql_abstract()

  $schema->sql_abstract($an_SQL_Abstract_More_instance); # set
  my $sqlam = $schema->sql_abstract;                     # get

Sets or retrieves the instance of SQL::Abstract::More used by this $schema. If the client code does not set it explictly, an instance wil be implicitly created.

dbi_prepare_method()

  $schema->dbi_prepare_method($method);      # set
  my $method = $schema->dbi_prepare_method;  # get

Sets or retrieves the method sent to DBI for preparing SQL statements. The default is "prepare"; it can be set to "prepare_cached" instead.

placeholder_prefix()

  $schema->placeholder_prefix($prefix);      # set
  my $prefix = $schema->placeholder_prefix;  # get

Sets or retrieves the prefix string to recognize "named placeholders" within a statement. That prefix should never match any regular data encountered in your application; the default is '?:'.

autolimit_firstrow()

  $schema->autolimit_firstrow($bolean);                      # set
  my $has_autolimit_firstrow = $schema->autolimit_firstrow;  # get

If this flag is true, calls to select(..., -result_as => 'firstrow') will automatically add -limit => 1.

select_implicitly_for()

  $schema->select_implicitly_for($string);      # set
  my $string = $schema->select_implicitly_for;  # get

Sets or retrieves a default value for the -for argument to select().

This default value is not set for statements with -result_as => 'subquery' (because the FOR clause in an SQL statement only makes sense at the top level, not in a subquery).

localize_state()

  {
    my $scope_guard = $schema->localize_state(@schema_members_to_localize);

    ... # do some work, possibly change state
  
  } # $scope_guard out of scope : previous state of $schema is restored 

Applies dynamic scoping to a $schema, like Perl's local construct (see "Temporary Values via local" in perlsub). Here however, since local is not perfectly suited for private class data, we provide a solution which is similar in spirit, but slightly different in implementation. The localize_state method internally takes a copy of the current state, and returns a handle to it. The handle should be stored in a my variable; when that variable goes out of scope (at block exit), then the previous state is restored.

The optional argument @schema_members_to_localize specifies precisely which schema members should be localized. When called with an empty list, the default is : dbh, dbh_options, debug, select_implicitly_for, dbi_prepare_method.

Creating "connected sources"

Data is stored in data sources that are accessed through a given schema; therefore, before issuing data manipulation statements, we need to establish a relationship between a data source and a schema : this is the job of the two methods below. Both create a DBIx::DataModel::ConnectedSource instance. That object can then perform data selection, update or deletion, as described in the next chapter.

Schema::table()

  my $source = $schema->table($table_name);
  my $result = $source->select(...); # or ->insert(..), ->update(..), etc.

This is the recommended way to reach a table from a schema. The $table_name is the Perl name that was declared in "define_table()", not the database name.

The method name is (intentionally!) misleading : instead of returning a table object or meta_table object, it returns a connected_source; however that object delegates many methods to the table and therefore can mimic it (actually, it works like a kind of proxy).

When in single-schema mode, the table() call can be replaced by a direct call to the table class: so for example, if we have a schema class HR containing a table Employee, the following calls are equivalent :

  $result = HR::Employee->select(...)
  $result = HR->table('Employee')->select(...)
  $result = HR->singleton->table('Employee')->select(...)

In the first syntax above, the HR::Employee subclass will implicitly reach for its schema singleton and for its meta-table, and will implicitly create a statement.

When in multi-schema mode, the only way is through the table() method :

  my $schema1 = HR->new(dbh => dbh1);
  my $schema2 = HR->new(dbh => dbh2);
  $result1 = $schema1->table('Employee')->select(...);
  $result2 = $schema2->table('Employee')->select(...);

This Schema::table() method should not be confused with the meta-schema navigation method "Meta::Schema::table()", described at the end of this document.

Schema::join()

  my $source = $schema->join($table, $path1, $path2, ...);
  my $result = $source->select(...);

This method calls "define_join()" to define or retrieve a join subclass; then it creates a fresh connected_source related to that join datasource and to the $schema. From that object, data can be retrieved through the "select()" method.

This Schema::join() method should not be confused with the meta-schema navigation method "Meta::Schema::join()", described at the end of this document, which is just an accessor method.

Besides, a derivative of the present method is defined in "Table::join()".

Transactions

do_transaction()

  my $coderef = sub {$table1->insert(...); $table2->update(...); ...};
  $schema->do_transaction($coderef);

Evaluates the code within a transaction. In case of failure, the transaction is rolled back, and an exception is raised, with a message containing the initial error and the status of the rollback (because the rollback itself may also fail). If you need finer details, you can treat the exception as an object with two methods initial_error and rollback_errors:

    eval {$schema->do_transaction($coderef); 1}
    or do {my $err = $@;
           explain($err->initial_error, $err->rollback_errors)};

Usually the coderef passed as argument will be a closure that may refer to variables local to the environment where the closure was created.

Nested calls to do_transaction are supported : only the top-level call will actually initiate and then commit the transaction, while an exception at any level will abort and rollback the whole thing.

If the nested transaction needs to temporarily work on a different database handle, it may do so by supplying the dbh and its options as additional arguments :

  $schema->do_transaction($coderef, $new_dbh, %new_dbh_options);

When called in this form, do_transaction will temporarily set the dbh to the supplied value, and then return to the previous value when the nested transaction is finished. Commits on all involved database handles are delayed until the top-level transaction is finished.

There is no support (yet!) for nested transactions with intermediate savepoints.

If the dbh is an instance of DBIx::RetryOverDisconnects, do_transaction() behaves like "txn_do" in DBIx::RetryOverDisconnects, i.e. it will inspect if the failure was due to a disconnection, and in that case will try again, as many times as required in the TxnRetries attribute.

Schema utility methods

Schema::metadm()

Returns the meta-schema instance associated with the invocant.

unbless()

  $schema->unbless($obj1, $obj2, ...);

Recursively applies "damn" in Acme::Damn to remove all class information from its arguments : these become plain Perl hashrefs, arrayrefs or scalars.

This may be useful if the datatree returned by a select(..) / expand(..) needs to be exported through an external module that only considers plain datastructures; this is the case for example with "Dump" in YAML::Syck.

Schema class methods, delegated to the meta-schema

Front-end data definition methods listed in "META-SCHEMA DEFINITION" can be called on the schema class, instead of the meta-schema; in other words, the following methods can be called on the schema class :

"Table()"
"View()"
"Association()"
"Composition()"
"Type()"

CONNECTED SOURCES

Principle

A connected source is basically just a pair of references to a $meta_source and to a $schema. Such objects are created explicitly through methods "Schema::table()" or "Schema::join()"; they are also created implicitly when, in single-schema mode, class methods are called on table classes. The constructor is not documented because client code is not expected to call it directly; instead, client code should use the Schema methods just mentioned.

Accessors

schema()

Accessor for the connected source's schema.

source()

Accessor for the connected source's meta-source.

metadm()

Synonym for source(). This is for convenience, because in something like

  my $meta_table = $schema->table($table_name)->metadm;

one naturally expects to get the meta-object associated with table $table_name, forgetting the fact that the subexpression <$schema-table($table_name) >> is a connected source and not a table!

Data retrieval

ConnectedSource::select()

Proxy method to Statement::select()

fetch()

  my $record = $connected_source->fetch(@key_values, \%options);

Fetches a single record, from its primary key value (on one or several columns). This method is just syntactic sugar for

  my $record = $connected_source->select(-fetch => \@key_values, %options);

(see below). %options may specify arguments to "select()", like -for, -pre_exec, -post_exec, etc.

fetch_cached()

  my $record = $connected_source->fetch_cached(@key_values, \%options);

Like fetch, except that the result is stored in a cache, and further calls to the same methods with the same parameters will return the cached record instead of going back to the database. The cache does not know about any updates to the database, so this is mainly useful for readonly data.

The cache is stored internally in $meta_source->{fetch_cached}{$dbh_addr}{$freeze_args} (where $dbh_addr is Scalar::Util::refaddr(MyTable->dbh) and $freeze_args is Storable::freeze(@keyValues, \%options)). If needed, client code may use this information to clear the cache or tie it to a more sophisticated caching module.

ConnectedSource::join()

  my $join_stmt = $connected_source->join(qw/path1 path2 .../);
  $join_stmt->prepare();

Returns a statement that will select a collection of data rows from tables associated with the current meta-source, performing the appropriate joins. Internally this is implemented throught the /define_join() method, with an additional -where criteria to constrain on the primary key(s) of the meta-source. That statement cannot be executed yet, because the values of the primary key are not known until we have an row of that source; but the statement can already be prepared. Later on, we can execute the statement by binding it to an instance of the $connected_source :

   my $obj          = $connected_source->fetch(...);
   my $related_rows = $join_stmt->execute($obj)->all;

This is especially useful for loop efficiency :

  my $join_stmt = $connected_source->join(qw/path1 path2 .../);
  $join_stmt->prepare();
  my $list = $connected_source->select(...);
  foreach my $row (@$list) {
    my $related_rows = $join_stmt->execute($row)->all;
    # ... work with $related_rows
  } 

Data manipulation

insert()

  my @ids = $connected_source->insert(
    {col1 => $val1, col2 => $val2, ...}, 
    {...},
    %options,
  );
  # or
  my @ids = $connected_source->insert(
    [qw/ col1   col2   .../],
    [    $val1, $val2, ... ],
    ...
  );

Inserts a collection of rows into the database, given either as a list of hashrefs, or as a first arrayref containing the column names, followed by a list of arrayrefs containing values for each row to be inserted.

In either form, the method applies the to_DB handlers, removes the no_update columns, and then inserts the new records into the database.

Primary key column(s) should of course be present in the supplied hashrefs, unless the the key is auto-generated by the database (see below).

Each hashref will be blessed into the $table class, and will be inserted through the internal _singleInsert() method. The default implementation of this method should be good enough for most common uses, but you may want to refine it in your table classes if you need some fancy handling on primary keys (like for example computing a random key and checking whether that key is free; see DBIx::DataModel::Doc::Internals for an example). The default implementation uses the following algorithm to retrieve keys auto-generated by the database :

  • if a dbh option called returning_through is found (see options passed to the "dbh" method), the method automatically adds a -returning clause for retrieving value(s) from the primary key column(s). The way to retrieve such values depends on returning_through: when 'FETCH', it performs an additional call to $sth->fetchrow_array; when 'INOUT', it binds inout parameters into the statement. When setting the database handle through the "dbh" method, the returning_through option is automatically set to 'FETCH' if it is a Pg driver, or automatically set to 'INOUT' if it is an Oracle driver.

  • if a dbh option called last_insert_id is found, this is taken as a callback function, which gets called as

      $dbh_options{last_insert_id}->($dbh, $table_name, $column_name)
  • if dbh options called catalog and/or schema are found, DBIx::DataModel will call

      $dbh->last_insert_id($dbh_options{catalog}, $dbh_options{schema},
                           $table_name, $column_name)
  • otherwise, DBIx::DataModel will call

      $dbh->last_insert_id(undef, undef, undef, undef)

Cascaded insert

If the table is a composite class (see Composition() above), then the component parts may be supplied within the hashref, with keys equal to the role names, and values given as arrayrefs of sub-hashrefs; then these will be inserted into the database, at the same time as the main record, with join values automatically filled in (cascaded insert). For example :

   HR::Employee->insert({firstname  => "Johann Sebastian",  
                         lastname   => "Bach",
                         activities => [{d_begin  => '01.01.1695',
                                         d_end    => '18.07.1750',
                                         dpt_code => 'CPT'}]});

Insert options

The insert() call may take a list of %options specified at the end of the argument list (notice they are not given as a hashref, but as a mere hash, or list of pairs). Actually the only option currently supported is -returning, :

  • if the -returning option is set to an empty hashref, the return value will be a list of hashrefs (one for each inserted record), containing the column name(s) and value(s) of the primary key for that record, and possibly containing subhashes or subarrays for other records created through cascaded inserts. For example:

      my @result = HR->table('Employee'>
                     ->insert({..., activities => [{...}, ...]},
                              ...,
                              -returning => {});
      my $prim_key_first_emp = $result[0]{emp_id};
      my $prim_key_first_act = $result[0]{activities}[0]{act_id};
  • if the -returning option is set to any other value, that value is passed to "insert" in SQL::Abstract::More and finally to the SQL level (INSERT ... RETURNING ...); whatever is returned from the database for each single record gets flattened into a single list transmitted back to the caller.

      my @result = $statement->insert({...}, ..., 
                                      -returning => $scalar_or_arrayref);
  • if the -returning option is absent, values returned by calls to _singleInsert() are collected into a flattened array, and then returned by insert(); usually, these are the primary keys of the inserted records. If this array contains several values, and insert() was called from a scalar context, a warning is issued.

Reftype checking

If a record contains columns that are arrayrefs or hashrefs, and these are not known as "component parts" (see "Cascaded insert" above), then a warning is generated and these columns are automatically removed from the record.

An exception to this rule is when the SQL::Abstract instance associated with the schema has the option array_datatypes => 1 : in that case, columns with arrayrefs are passed as-is to the SQL::Abstract::More and DBI layers, under the assumption that the DBD driver will take appropriate action on those datatypes.

Another exception is when the value is of shape [$orig_value, \%datataype], which is interpreted as a value together with an SQL datatype; again this is passed to the SQL::Abstract::More layer.

update()

  $connected_source->update(-set   => {col1 => $val1, ...},
                            -where => \%condition);
  $connected_source->update({col1 => $val1, ...});
  $connected_source->update(@primary_key, {col1 => $val1, ...});

Generates a request to the database to update one or several records. The API for this method accepts several syntaxes :

  • the first syntax with -set and -where keywords closely reflects the SQL syntax of shape

      UPDATE table SET col1='val1', col2='val2', ... WHERE ...

    This is mostly used for updating several records simultaneously (bulk update).

  • the second syntax is used for updating a single record, passed as a hashref; it does not matter if this hashref is blessed or not, because a copy will be blessed into the class of the invocant. A -where clause will be automatically generated by extracting the primary key column(s) from the record; then the remaining columns are treated as the columns to update.

  • the third syntax with @primary_key is an alternate way to supply the values for the primary key (it may be more convenient because you don't need to repeat the name of primary key columns). So if emp_id is the primary key of table Employee, then the following are equivalent :

      HR->table('Employee')->update({emp_id  => $eid, 
                                     address => $new_addr, 
                                     phone   => $new_phone});
      HR->table('Employee')->update($eid => {address => $new_addr, 
                                             phone   => $new_phone});

Yet another syntax is to directly call update() on a data row, without any argument : see "Table::update()" below.

Before calling the database, the to_DB handlers are applied, the no_update_columns are removed, and the auto_update_columns are inserted.

The update method only updates the columns received as arguments : it knows nothing about other columns that may sit in the database. Therefore if you have two concurrent clients doing

  (client1)  ...->update($id, {c1 => $v1, c2 => $v2});
  (client2)  ...->update($id, {c3 => $v3, c4 => $v4, c5 => $v5});

the final state of record $id in the database is guaranteed to reflect changes from both clients, because the sets ('c1', 'c2') and ('c3', 'c4', 'c5') of updated columns are disjoint.

Reftype checking

Like for inserts, columns with arrayrefs or hashrefs are automatically removed from the update list (with a warning), except some special cases as described above for insert().

delete()

  $connected_source->delete(-where => \%condition);
  $connected_source->delete({col1 => $val1, ...});
  $connected_source->delete(@primary_key);

Generates a request to the database to delete one or several records. The API for this method accepts several syntaxes :

  • the first syntax with the -where keyword closely reflects the SQL syntax of shape

      DELETE FROM table WHERE ...

    This is mostly used for deleting several records simultaneously (bulk delete).

  • the second syntax is used for deleting a single record. A -where clause will be automatically generated by extracting the primary key column(s) from the record.

    If the source is a composite class (see Composition() above), and if the record contains references to lists of component parts, then those will be recursively deleted together with the main object (cascaded delete). However, if there are other component parts in the database, not referenced in the hashref, then those will not be automatically deleted : in other words, the delete method does not go by itself to the database to find all component parts (this is the job of the client code, or sometimes of the database itself).

  • the third syntax with @primary_key is an alternate way to supply the values for the primary key (it may be more convenient because you don't need to repeat the name of primary key columns). Note that $statement->delete(11, 22) does not mean "delete records with keys 11 and 22", but rather "delete record having primary key (11, 22)"; in other words, with this syntax you only delete one record at a time.

Yet another syntax is to directly call delete() on a data row, without any argument : see "Table::delete()" below.

Utility methods

ConnectedSource::bless_from_DB()

Proxy method to Statement::bless_from_DB().

STATEMENTS

A statement object encapsulates a SELECT request to the database. It starts by assembling various pieces of information : the datasource to query, the database connection, the various clauses for generating the SQL request. Once everything is ready, the database request is issued, and the results are collected. These steps may occur either all at once (which is the most frequent case), or through several calls to intermediate methods; see for example the methods "refine()", "prepare()", "execute()" below.

By default, statements are created as instances of DBIx::DataModel::Statement; but the schema may define another "statement_class()" for application-specific purposes : for instance the DBIx::DataModel::Statement::JDBC subclass in this distribution adds some methods specific to DBD::JDBC data sources, allowing to call JDBC methods getMemberCount(), absolute(), relative(), etc.

Data selection

select()

  $result = $statement->select(
     -columns       => \@columns, 
       # OR : -columns => [-DISTINCT => @columns],
     -where         => \%where_criteria,
       # OR : -fetch => $key,
       # OR : -fetch => \@key,
     -where_on      => \%where_on_criteria,
     -group_by      => \@groupings,
     -having        => \%having_criteria,
     -order_by      => \@order,
     -for           => $purpose,
     -post_SQL      => sub {...},
     -pre_exec      => sub {...},
     -post_exec     => sub {...},
     -post_bless    => sub {...},
     -prepare_attrs => \%attrs,
     -limit         => $limit,
     -offset        => $offset,
     -page_size     => $page_size,
     -page_index    => $page_index,
     -column_types  => \%column_types,
     -result_as     => 'rows'      || 'firstrow' 
                    || 'hashref'   || [hashref => @cols]
                    || 'sth'       || 'sql' 
                    || 'subquery'  || 'flat_arrayref' 
                    || 'statement' || 'fast_statement'
  );
  

Calls "refine()" to process arguments (if any), and returns a result as specified by the -result_as argument (see below). If necessary, intermediate steps are automatically performed, like calling "sqlize()", "prepare()" and "execute()".

Arguments are all optional and are passed by name :

-columns => \@columns

\@columns is a reference to an array of SQL column specifications (i.e. column names, * or table.*, functions, etc.).

Initial words in @columns that start with a hyphen are treated as verbatim SQL : in particular, -columns => [-DISTINCT => qw/col1 col2 .../] will yield

  SELECT DISTINCT col1, col2, ... FROM ...

A '|' in a column is translated into an 'AS' clause, according to the current SQL dialect in SQL::Abstract::More : this is convenient when using perl qw/.../ operator for columns, as in

  -columns => [ qw/table1.longColumn|t1lc table2.longColumn|t2lc/ ]

Column aliasing should be avoided on key columns (either primary or foreign keys), because role methods will no longer be able to navigate through the joins (currently DBIx::DataModel is not clever enough to rename its internal join constraints according to column aliases). Aliasing on non-key columns is OK, and column handlers will operate properly on aliased columns.

The argument to -columns can also be a string instead of an arrayref, like for example "c1 AS foobar, MAX(c2) AS m_c2, COUNT(c3) AS n_c3"; however this is mainly for backwards compatibility. The recommended way is to use the arrayref notation as explained above :

  -columns => [ qw/  c1|foobar   MAX(c2)|m_c2   COUNT(c3)|n_c3  / ]

If omitted, \@columns takes the default, which is usually '*', unless modified through the default_columns argument to "define_table()".

No verification is done on the list of retrieved \@columns; in particular, the list must not always include primary or foreign keys --- but if this is the case, later attempts to perform joins or updates will obviously fail.

-where => \%where_criteria

\%where_criteria is a reference to a hash or array of criteria that will be translated into SQL clauses. In most cases, this will just be something like {col1 => 'val1', col2 => 'val2'}; see SQL::Abstract::select for a detailed description of the structure of that hash or array. It can also be a plain SQL string like "col1 IN (3, 5, 7, 11) OR col2 IS NOT NULL".

-fetch => \@columns

equivalent to

  ...->select(-where => {<primary_key_column0> => $columns[0],
                         <primary_key_column1> => $columns[1], ...},
              -result_as => "firstrow")

If the primary key ranges on one single column (which is the most frequent case), then the argument to -fetch can also be a single scalar value :

  ...->select(-fetch => $key)

When calling a table directly, the select(-fetch => ..) syntax is awkward; you will most certainly prefer the syntactic sugar offered by the fetch() method :

  $table->fetch(@key)

However, the -fetch notation is useful when walking through association roles :

  $employee->activities(-fetch => $act_id)

This example will generate the following SQL

  SELECT * FROM activity WHERE act_id=$act_id AND emp_id=$employee->{emp_id}

Notice how this is different from

  $schema->table('Activity')->fetch($act_id)

which would generate

  SELECT * FROM activity WHERE act_id=$act_id

Both examples would end up with the same record, but in the first case there is an additional check that this record really belongs to the given employee.

In presence of -fetch, arguments -where and -select_as are not allowed.

-where_on => \%where_on_criteria
  $schema->join(qw/Employee => activities => department/)->select(
    -where => {firstname => 'Hector',
               dpt_name  => 'Music'},
    -where_on => {
       T_Activity   => {d_end => {"<" => '01.01.2001'}},
       T_Department => {dpt_head => 999},
     },
  );

Adds some additional criteria into the JOIN ... ON ... SQL statement. This feature is still experimental. These criteria are expressed as a hashref; keys of that hashref are names of database tables participating in the join. The criteria will be combined with regular foreign key constraints produced by the join; so in the (silly) example above, the generated SQL is :

     SELECT * FROM T_Employee 
       LEFT OUTER JOIN T_Activity
         ON T_Employee.emp_id = T_Activity.emp_id AND d_end < ?
       LEFT OUTER JOIN T_Department 
         ON T_Activity.dpt_id = T_Department.dpt_id AND dpt_head = ?
       WHERE dpt_name = ? AND firstname = ?
-group_by => "string" or -group_by => \@array

adds a GROUP BY clause in the SQL statement. Grouping columns are specified either by a plain string or by an array of strings.

-having => "string" or -having => \%criteria

adds a HAVING clause in the SQL statement (only makes sense together with a GROUP BY clause). This is like a -where clause, except that the criteria are applied after grouping has occured.

-order_by => \@order

\@order is a reference to a list of columns for sorting. It can also be a plain SQL string like "col1 DESC, col3, col2 DESC". Columns can also be prefixed by '+' or '-' for indicating sorting directions, so for example -orderBy => [qw/-col1 +col2 -col3/] will generate the SQL clause ORDER BY col1 DESC, col2 ASC, col3 DESC.

-for => $clause

specifies an additional clause to be added at the end of the SQL statement, like -for => 'read only' or -for => 'update'.

-post_SQL => sub{...}

hook for specifying a callback function to be called on SQL code and

bind values, before preparing the statement. It will be called as follows:

  ($sql, @bind) = $args->{-post_SQL}->($sql, @bind) if $args->{-post_SQL};
-pre_exec => sub{...}, -post_exec => sub{...}

hooks for specifying callback functions to be called on the DBI statement handle, just before or just after invoking execute(). So the sequence will be more or less like this:

  $sth = $dbh->prepare($sql_statement);
  $pre_exec_callback->($sth)  if $pre_exec_callback;
  $sth->execute(@bind_values);
  $post_exec_callback->($sth) if $post_exec_callback;

This is mostly useful if you need to call driver-specific functions at those stages.

-post_bless => sub{...}

hook for specifying a callback function to be called on data rows. The callback will be called after bless_from_DB, i.e. the row is already an object of the proper class and column handlers have been applied.

-prepare_attrs => \%attrs

Optional attributes that will be transmitted to "prepare" in DBI.

-page_size => $page_size

specifies how many rows will be retrieved per "page" of data. Default is unlimited (or more precisely the maximum value of a short integer on your system). When specified, automatically implies -limit.

-page_index => $page_index

specifies the page number (starting at 1). Default is 1. When specified, automatically implies -offset.

-limit => $limit

limit to the number of rows that will be retrieved. Automatically implied by -page_size.

-offset => $offset

Automatically implied by -page_index.

-column_types => \%column_types

dynamically specifies some column types at the statement level. Usually column types are defined within tables at compile time, but it may be necessary to defined additional types within the statement, for example when using database functions and/or aliases. The argument is a hashref in which keys are names of column types defined within the schema, and values are arrayrefs of column names :

   select(-columns     => [qw/ MAX(date_col)|max_date 
                               MIN(date_col)|min_date ... /],
          -column_types => { Date => [qw/max_date min_date/] },
          ...)
-dbi_prepare_method => $method_name

overrides the "dbi_prepare_method()" specified at the schema level.

-result_as => $result_kind

specifies what kind of result will be produced. The result is always a scalar and does not depend on the calling context. Possible result kinds are :

rows

The result will be a ref to an array of rows, blessed into objects of the class. This is the default result kind. If there are no data rows, a ref to an empty list is returned.

firstrow

The result will be just the first data row, blessed into an object of the class. If there is no data, undef is returned.

hashref || [hashref => @cols]

The result will be a hashref. Keys in the hash correspond to distinct values of the specified columns, and values are data row objects. If the argument is given as [hashref => @cols], the column(s) are specified by the caller; otherwise if the argument is given as a simple string, @cols will default to $source->primary_key. If there is more than one column, the result will be a tree of nested hashes. This -result_as is normally used only where the key fields values for each row are unique. If multiple rows are returned with the same values for the key fields then later rows overwrite earlier ones.

flat_arrayref

The result will be a ref to an array that concatenates results from each row. Usually this is combined with a -columns argument with one single column, to get a vertical slice from a resultset, like in

  my $all_names = $schema->table('People')->select(
    -columns   => [-DISTINCT => qw/firstname/],
    -result_As => 'flat_arrayref',
   );
  print sort @$all_names;

However, it may also be used for example to fill a hash from pairs retrieved from each row, like in

  my $pairs = $schema->table('People')->select(
    -columns   => [qw/pers_id firstname/],
    -result_as => 'flat_arrayref'
   );
  my %hash = @$pairs;

Finally, it can be convenient for avoiding column aliases, when using aggregator functions :

  my $array_ref = $source->select(-columns  => [qw/MAX(col1)
                                                  AVG(col2)
                                                  COUNT(DISTINCT(col3))/],
                                  -where     => ...,
                                  -result_as => 'flat_arrayref');
  my ($max_col1, $avg_col2, $count_col3) = @$array_ref;
statement

Returns the statement itself; data rows will be retrieved later, through the "next()" or "all()" methods. A typical usage pattern is :

  my $statement = $schema->table($table_name)->select(
    -where     => \%criteria, 
    -result_as => 'statement',
  );
  while (my $row = $statement->next) {
    do_something_with($row);
  }
fast_statement

The result is like a normal statement, except that rows will be successively fetched into the same memory location, using DBI's fetch and bind_columns methods. This is the fastest way to get data; however, pay attention to the following warning from DBI's documentation : Note that the same array reference is returned for each fetch, so don't store the reference and then use it after a later fetch. Also, the elements of the array are also reused for each row, so take care if you want to take a reference to an element.

Since each row must be retrieved separately, call all() or next(10) on a fast statement make no sense and therefore generate an exception.

sth

The result will be an executed DBI statement handle. Then it is up to the caller to retrieve data rows using the DBI API. If needed, these rows can be later blessed into appropriate objects through bless_from_DB().

sql

In scalar context, the result will just be the generated SQL statement. In list context, it will be ($sql, @bind), i.e. the SQL statement together with the bind values.

subquery

Returns a ref to an arrayref containing \["($sql)", @bind]. This is meant to be passed to a second query through SQL::Abstract, as in :

  my $subquery = $source1->select(..., -result_as => 'subquery');
  my $rows     = $source2->select(
      -columns => ...,
      -where   => {foo => 123, bar => {-not_in => $subquery}}
   );

Retrieving data rows

next()

  while (my $row = $statement->next) {...}

  my $slice_arrayref = $statement->next(10);

If called without argument, returns the next data row, or undef if there are no more data rows. If called with a numeric argument, attempts to retrieve that number of rows, and returns an arrayref; the size of the array may be smaller than required, if there were no more data rows in the database. The numeric argument is forbidden when select() was called with -result_as => 'fast_statement', because in that case rows must be retrieved one at a time.

Each row is blessed into an object of the proper class, and is passed to the -post_bless callback (if applicable).

all()

  my $rows = $statement->all;

Similar to the next method, but returns an arrayref containing all remaining rows. This method is forbidden when select() was called with -result_as => 'fast_statement', because in that case rows must be retrieved one at a time.

row_count()

Returns the number of rows corresponding to the current executed statement. Raises an exception if the statement is not in state "executed".

The default implementation for counting rows involves an additional call to the database (SELECT COUNT(*) FROM ...); but a Statement subclass may override this method if the database driver provides a better way (for example DBIx::DataModel::Statement::JDBC calls the Java method getMemberCount).

row_num()

Returns the index number of the next row to be fetched (starting at $self->offset, or 0 by default).

Pagination

page_size()

Returns the page size (requested number of rows), as it was set through the -page_size argument to refine() or select().

page_index()

Returns the current page index (starting at 1). Always returns 1 if no pagination is activated (no -page_size argument was provided).

offset()

Returns the current requested row offset (starting at 0). This offset changes when a request is made to go to another page; but it does not change when retrieving successive rows through the "next" method.

page_count()

Calls "row_count()" to get the total number of rows for the current statement, and then computes the total number of pages.

goto_page()

  $statement->goto_page($page_index);

Goes to the beginning of the specified page; usually this involves a new call to "execute", unless the current statement has methods to scroll through the result set (see for example DBIx::DataModel::Statement::JDBC).

Like for Perl arrays, a negative index is interpreted as going backwards from the last page.

shift_pages()

  $statement->shift_pages($delta);

Goes to the beginning of the page corresponding to the current page index + $delta.

next_page()

A shortcut for $statement->shift_pages(1).

page_boundaries()

  my ($first, $last) = $statement->page_boundaries;

Returns the indices of first and last rows on the current page. These numbers are given in "user coordinates", i.e. starting at 1, not 0 : so if -page_size is 10 and -page_index is 3, the boundaries are 21 / 30, while technically the current offset is 20. On the last page, the $last index corresponds to row_count (so $last - $first is not always equal to pageSize + 1).

page_rows()

Returns an arrayref of rows corresponding to the current page (maximum -page_size rows).

Dealing with the statement lifecycle

status()

Returns the current status or the statement. This is a dualvar with a string component (new, sqlized, prepared, executed) and an integer component (1, 2, 3, 4).

refine()

  $statement->refine(%args);

Set up some named parameters on the statement, that will be used later when calling "sqlize()", "prepare()" and "execute()". Admitted parameters are listed under the "select()" method.

For most parameters, when refine() is called several times on the same statement, the latest value takes precedence (like the latest update in a Perl hash) : this is the case for instance for parameters -columns, -order_by, etc. However, for the -where parameter, the statement accumulates all successive conditions, combining them with an implicit "AND". So for example

   $statement->refine(-where => {col1 => $val1, col2 => {">" => $min}})
             ->refine(-where => {col3 => $val3, col2 => {"<" => $max}});

is equivalent to

   $statement->refine(-where => {col1 => $val1, 
                                 col2 => {">" => $min,
                                          "<" => $max},
                                 col3 => $val3});

Refining a statement can only occur when the statement is still in status NEW; after "sqlize()" has been called, parameters are frozen.

sqlize()

  $statement->sqlize(%args);

Generates SQL from all parameters accumulated so far in the statement. The statement switches from state NEW to state SQLIZED, which forbids any further refinement of the statement (but does not forbid further bindings).

Arguments are optional, and are just a shortcut instead of writing

  $statement->refine(%args)->sqlize;

prepare()

  $statement->prepare(%args);

Method sqlized is called automatically if necessary. Then the SQL is sent to the database, and the statement handle returned by DBI ($sth) is stored internally within the statement. The state switches to PREPARED.

Arguments are optional, and are just a shortcut instead of writing

  $statement->sqlize(%args)->prepare;

execute()

  $statement->execute(@bindings);

Translates the internal named bindings into positional bindings, calls "execute" in DBI on the internal $sth, and applies the -pre_exec and -post_exec callbacks if necessary. The state switches to EXECUTED.

Arguments are optional, and are just a shortcut instead of writing

  $statement->bind(@bindings)->execute;

An executed statement can be executed again, possibly with some different bindings. When this happens, the internal result set is reset, and fresh data rows can be retrieved again through the "next" or "all" methods.

sql()

  $sql         = $statement->sql;
  (sql, @bind) = $statement->sql;

In scalar context, returns the SQL code for this statement (or undef if the statement is not yet sqlized).

In list context, returns the SQL code followed by the bind values, suitable for a call to "execute" in DBI.

Obviously, this method is only available after the statement has been sqlized (through direct call to the "sqlize" method, or indirect call via "prepare", "execute" or "select").

bind()

  $statement->bind(foo => 123, bar => 456);
  $statement->bind({foo => 123, bar => 456}); # equivalent to above

  $statement->bind(0 => 123, 1 => 456);
  $statement->bind([123, 456]);               # equivalent to above

Takes a list of bindings (name-value pairs), and associates them to named placeholders within the statement. Named placeholders are defined at the DBIx::DataModel level, not within the database.

If successive bindings occur on the same named placeholder, the last value silently overrides previous values. If a binding has no corresponding named placeholder, it is ignored. Names can be any string (including numbers), except reserved words limit and offset, which have a special use for pagination.

The list may alternatively be given as a hashref. This is convenient for example in situations like

  my $rows      = $source->select(...);   
  my $statement = $source->some_method;
  foreach my $row (@$rows) {
    my $related_rows = $statement->bind($row)->select;
  }

The list may also be given as an arrayref; this is equivalent to a hashref in which keys are positions within the array.

Finally, there is a ternary form of bind for passing DBI-specific arguments.

  use DBI qw/:sql_types/;
  $statement->bind(foo => $val, {TYPE => SQL_INTEGER});

See "bind_param" in DBI for explanations.

reset()

  $statement->reset(%args);

Resets the statement back into NEW state, erasing all information except references to the $schema and $meta_source.

%args are optional; if present, they are passed to the "refine()" method.

Proxy methods

Since the statement holds references to a schema and a meta-source, there are some accessors or proxy methods to these objects.

schema()

Accessor for the statement's schema.

source()

Accessor for the statement's meta-source.

Utility methods

bless_from_DB()

  $statement->bless_from_DB($record);

Blesses $record into an object of $statement->meta_source->class, and applies the from_DB column handlers.

SOURCES

Principle

Data rows coming from the database are blessed into source classes, which are either tables or joins; such classes implement instance methods for manipulating the row objects, or for navigating to related rows. Source classes do not contain information about the data structure (which database table, which associations, etc.); for getting such information, use the metadm method to access the metasource object.

Path methods specific to each source

When an "Association()" is defined between two tables, methods are automatically added into the corresponding classes; so for example with

  $schema->Association([qw/Department department 1 /],
                       [qw/Activity   activities * /]);

the Department class will have an activities() method, and the Activity class will have a department() method. Such methods are called path methods; their names correspond to the UML roles defined in the association. Path methods always take the same arguments as the "select()" method (actually, they are implemented through an internal call to select()).

UML role names should preferably be chosen to avoid collisions with the builtin methods listed below. However, should a conflict occur, it is always possible to alleviate the ambiguity using a fully qualified method name: for instance if a table has a path method that collides with the "schema()" method presented below, we can write

  # calling the path method
  my $related_row_called_schema = $data_row->schema(); 

  # calling the builtin method
  my $dbidm_schema = $data_row->DBIx::DataModel::Source::schema();

A join between several tables (see the "join()" method) creates a new class that inherits all path methods from all tables participating in the join; in case of name conflics, the latest table takes precedence. Again, it is possible to use fully qualified method names if necessary.

Accessors

Source::metadm()

Returns the meta-source instance associated with the invocant.

schema()

Returns the instance of DBIx::DataModel::Schema from which the current data row was retrieved. When in single-schema mode, the schema comes from $self->metadm->schema->class->singleton; when in multi-schema mode, a reference to the schema is kept under __schema within each row object.

primary_key()

  my @primary_key_columns = $class->primary_key;
  my @primary_key_values  = $object->primary_key;

If called as a class method, returns the list of columns registered as primary key for that table (via "Table()"), or computed as primary key for that view (concatenation of primary keys of joined tables that are in a 1-to-many association).

If called as an instance method, returns the list of values in those columns.

When called in scalar context and the primary key has only one column, returns that column (so you can call my $k = $obj->primary_key).

Data manipulation

expand()

  $row->expand($path, %options);

Executes the method $path to follow an Association, stores the result in the object itself under $row->{$path}, and returns that result. This is typically used to expand an object into a tree datastructure. If present, %options are passed to $row->$path(%options), for example for specifying -where, -columns or -order_by options.

After the expansion, further calls to $row->$path (without any arguments) will reuse that same expanded result instead of calling the database again. This caching improves efficiency, but also introduces the risk of side-effects across your code : after

  $row->expand(some_path => (-columns => [qw/just some columns/],
                             -where   => {some_field => 'restriction'}))

further calls to $row->some_path() will just return a dataset restricted according to the above criteria, instead of a full join. To prevent that effect, you would need to delete $row->{some_path}, or to call the role with arguments, like $row->some_path(-columns => '*').

auto_expand()

  $record->auto_expand( $with_recursion );

Asks the object to expand itself with some objects in foreign tables. Does nothing by default. Should be redefined in subclasses, most probably through the "define_auto_expand" method. If the optional argument $with_recursion is true, then auto_expand is recursively called on the expanded objects.

apply_column_handler()

  $class ->apply_column_handler($handler_name, \@rows);
  $row   ->apply_column_handler($handler_name);

Inspects the target object or list of objects; for every column that exists in the object, checks whether a handler named $handler_name was declared for that column (see methods "define_type" and "define_column_type"), and if so, calls the handler. By this definition, if a column is absent in an object, then the handler for that column is not called, even though it was declared in the class.

The results of handler calls are collected into a hashref, with an entry for each column name. The value of each entry depends on how apply_column_handlers was called : if it was called as an instance method, then the result is something of shape

  {column_name1 => result_value1, column_name2 => result_value2, ... }

if it was called as a class method (i.e. if \@objects is defined), then the result is something of shape

  {column_name1 => [result_value1_for_object1, result_value1_for_object2, ...], 
   column_name2 => [result_value2_for_object1, result_value2_for_object2, ...], 
   ... }

If column_name is not present in the target object(s), then the corresponding result value is undef.

has_invalid_columns

  my $invalid_columns = $row->has_invalid_columns;
  if ($invalid_columns) {
    print "wrong data in columns ", join(", ", @$invalid_columns);
  }
  else {
   print "all columns OK";
  }

Applies the 'validate' handler to all existent columns. Returns a ref to the list of invalid columns, or undef if there are none.

Note that this is validation at the column level, not at the record level. As a result, your validation handlers can check if an existent column is empty, but cannot check if a column is missing (because in that case the handler would not be called).

Your 'validate' handlers, defined through "define_type", should return 0 or an empty string whenever the column value is invalid. Never return undef, because we would no longer be able to distinguish between an invalid existent column and a missing column.

TO_JSON

  use JSON;
  my $json_converter = JSON->new->convert_blessed(1);
  my $json_text      = $json_converter->encode($data_row);

The Source class implements a TO_JSON method, so that data rows from any table or any join can be easily converted into JSON strings (including nested rows that may have been added by the "expand" method). See "convert_blessed" in JSON for more explanations.

The TO_JSON method merely copies the object hash into a plain hash, and removes the __schema slot.

Methods delegated to the ConnectedSource class

Table::update()

  $source_class->update(@update_args);
  $source_instance->update({field1 => $val1, ...});
  $source_instance->update();

This is both a class and an instance method, that implicitly creates a a connected source from the source's schema and meta-source, and then calls the update() method on that connected source.

When used as a class method, the argument syntax is exactly like for Connected_source::update().

When used as an instance method with arguments, this is equivalent to

  $connected_source->update($source_instance->primary_key,
                            {field1 => $val1, ...});

in other words, the invocant is only used for taking its primary key, and the updated fields are passed as an argument hashref.

When used as an instance method without arguments, this is equivalent to

  $connected_source->update($source_instance);

in other words, the updated fields are all column values stored in memory within the object (ignoring all non-scalar values).

Table::delete()

  $source_class->delete(@delete_args);
  $source_instance->delete();

Exactly like the update() method just above, but calling ConnectedSource::delete().

Table::join()

Calls "ConnectedSource::join()" to create a statement, and immediately binds it to the current object. So for example if $emp->{emp_id} == 987, then

  $emp->join(qw/activities department/)
      ->select(-where => {d_end => undef})

will generate

  SELECT * FROM Activity INNER JOIN Department 
                         ON Activity.dpt_id = Department.dpt_id
           WHERE emp_id = 987 
             AND d_end IS NULL

Observe that the WHERE clause contains a combination of criteria, coming on one hand from the initial $emp object, and on the other hand from the regular -where clause within the select().

class methods delegated to the ConnectedSource class

When in single-schema mode, some class methods are delegated to a connecte source created on the fly, using the "singleton()" method; hence it is possible to write requests like

  my $result = HR::Employee->select(...);

instead of

  my $result = HR->table('Employee')->select(...);

Source::select()

  my $result = $source->select(%args);

Equivalent to

  my $metadm      = $source->metadm;
  my $meta_schema = $metadm->schema;
  my $schema      = $meta_schema->class->singleton;
  my $cs_class    = $meta_schema->connected_source_class;
  $cs_class->new($metadm, $schema)->select(%args);

Source::fetch()

similar to select() above

Source::fetch_cached()

similar to select() above

Source::bless_from_DB()

similar to select() above

META-SCHEMA NAVIGATION

Meta-schema methods

tables()

  my @meta_tables  = $meta_schema->tables;

Returns all DBIx::DataModel::Meta::Source::Table instances declared in this $meta_schema.

Meta::Schema::table()

  my $meta_table   = $meta_schema->table($table_name);

Returns the single instance of DBIx::DataModel::Meta::Source::Table with name $table_name, or undef.

associations()

  my @associations = $meta_schema->associations;

Returns all DBIx::DataModel::Meta::Association instances declared in this $meta_schema.

association()

  my $association  = $meta_schema->associations($association_name);

Returns the single instance of DBIx::DataModel::Meta::Source::Association with name $association_name, or undef.

types()

  my @types        = $meta_schema->types;

Returns all DBIx::DataModel::Meta::Type instances declared in this $meta_schema.

type()

  my $type         = $meta_schema->type($type_name);

Returns the single instance of DBIx::DataModel::Meta::Type with name $type_name, or undef.

joins()

  my @joins        = $meta_schema->joins;

Returns all DBIx::DataModel::Meta::Source::Join instances declared in this $meta_schema.

Meta::Schema::join()

  my $join         = $meta_schema->join($join_name);

Returns the single instance of DBIx::DataModel::Meta::Source::Join with name $join_name, or undef.

other accessors

Accessor methods are defined for the following members of the $meta_schema :

class()
sql_no_inner_after_left_join()
auto_insert_columns()
auto_update_columns()
no_update_columns()
table_parent()
table_metaclass()
join_parent()
join_metaclass()
association_metaclass()
path_metaclass()
type_metaclass()
statement_class()
connected_source_class()

Meta-source methods

Accessor methods are defined for the following members of a $meta_source (instance of either DBIx::DataModel::Meta::Source::Table or DBIx::DataModel::Meta::Source::Join) :

schema()
class()
default_columns()
parents()
primary_key()
aliased tables()
name()

In addition, the following methods return dynamic lists :

ancestors()

Returns a flattened list of recursive calls to the "parents()" method.

auto_insert_column()

Returns a flattened hash, built from auto_insert_columns declared in this source, in its ancestors, or in the $meta_schema.

auto_update_column()

Returns a flattened hash, built from auto_update_columns declared in this source, in its ancestors, or in the $meta_schema.

no_update_column()

Returns a flattened hash, built from no_update_columns declared in this source, in its ancestors, or in the $meta_schema. Keys are column names, values are insignificant.

path()

  my %all_path      = $meta_source->path;
  my $specific_path = $meta_source->path($path_name);

Without any argument, returns a flattened hash of all paths accessible from this source : keys are path names, and values are instances of DBIx::DataModel::Meta::Path.

If a $path_name is supplied, returns the corresponding metapath object.

db_from()

Returns what will be injected as -from argument into the "select" in SQL::Abstract::More call.

where()

Returns the optional "where" condition associated with this source (in the case of "View()").

Meta-table methods

In addition to the $meta_source methods above, the following methods are defined for an instance of DBIx::DataModel::Meta::Source::Table :

components()

Returns the list of other meta-sources that have been declared as components of this source, through the "Composition()" declaration.

Association methods

schema()

The DBIx::DataModel::Meta::Schema instance in which this association is declared.

name()

Returns the association name

kind()

Returns the association kind (Association or Composition).

path_AB()

Returns the DBIx::DataModel::Meta::Path object describing the path from A to B.

path_BA()

Returns the DBIx::DataModel::Meta::Path object describing the path from B to A.

Path methods

name()

The name of this path.

from()

Reference to the DBIx::DataModel::Meta::Source::Table where this path starts.

to()

Reference to the DBIx::DataModel::Meta::Source::Table where this path ends.

on()

Hashref for generating the join condition (keys are colums for the left-hand side, values are columns for the right-hand side).

multiplicity()

see "Association methods"

association()

Reference to the DBIx::DataModel::Meta::Association that created this path.

direction()

Either "AB" or "BA".

opposite()

Returns the path object representing the opposite direction.

Type methods

schema()

The DBIx::DataModel::Meta::Schema instance in which this type is declared.

name()

Name of this type.

handlers()

Hashref of handlers declared in this type (keys are handler names, values are handler bodies, i.e. coderefs).