NAME

Tangram::Storage - persistent object database

SYNOPSIS

   use Tangram;
   
   $storage = Tangram::Storage->connect( $schema,
      $data_source, $username, $password );

   $oid = $storage->insert( $obj );
   @oids = $storage->insert( @objs );

   $storage->update( $obj );
   $storage->update( @objs );

   $obj = $storage->load( $oid );
   @objs = $storage->load( @oids );

   @objs = $storage->select( $class );
   @objs = $storage->select( $remote, $filter );

   $cursor = $storage->cursor( $remote, $filter );

   if ($storage->oid_isa($oid, "ClassName")) {
       # oid $oid is a ClassName
   }

   $storage->disconnect();

DESCRIPTION

A Tangram::Storage object is a connection to a database configured for use with Tangram.

MEMORY MANAGEMENT

Starting with version 1.18, Tangram attempts to use the support for weak references that was introduced in Perl 5.6. Whether that support is found or not has a major impact on how Storage influences object lifetime.

If weakref support is available, Storage uses weak references to keep track of objects that have already been loaded. This does not prevent the objects from being reclaimed by Perl. In other words, the client code decides how long an object remains in memory.

If weakref support is not available, Storage uses normal, 'strong' references. Storage will pin in memory all the objects that have been loaded and inserted through it, until you call "disconnect" or "unload".

In either case, Tangram will not break circular structures for you.
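
The difference between the two modes can be illustrated without Tangram at all. The following is a minimal sketch, assuming only the core Scalar::Util module; the %loaded hash is a hypothetical stand-in for Storage's table of loaded objects, not Tangram's actual data structure.

```perl
use strict;
use warnings;
use Scalar::Util qw(weaken);

# Hypothetical stand-in for Storage's table of loaded objects.
my %loaded;

my $obj = { name => 'Homer' };   # client code holds the only strong reference
$loaded{1} = $obj;
weaken( $loaded{1} );            # the table entry no longer keeps $obj alive

print defined $loaded{1} ? "still in memory\n" : "reclaimed\n";
undef $obj;                      # the client drops its reference
print defined $loaded{1} ? "still in memory\n" : "reclaimed\n";

# Circular structures keep each other alive regardless of Storage.
# Weakening one direction by hand is one way to break the cycle.
my $parent = { name => 'Homer' };
my $child  = { name => 'Bart', parent => $parent };
$parent->{child} = $child;
weaken( $child->{parent} );
```

Without the first weaken() call the table entry would be a strong reference, and the object would stay in memory after the client dropped it, which is the behaviour described for Perls without weakref support.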

INTERNAL CONNECTION

Except in the implementation of cursor(), Tangram uses a single DBI connection in its operations. That connection is called the 'internal' connection. Since, in general, database managers do not allow multiple result sets on the same connection, the internal connection can be used only to carry out a single task at a time.

Tangram::Cursors returned by cursor() do not suffer from this limitation because they use a separate DBI connection.

CLASS METHODS

connect

   $storage = connect( $schema,
      $data_source, $username, $auth, \%options )

Connects to a storage and returns a handle object. Dies in case of failure.

$schema is a Tangram::Schema object consistent with the database.

$data_source, $username and $auth are passed directly to DBI::connect().

\%options is a reference to a hash that may contain the following fields:

  • dbh

    Pass in an already connected DBI handle

  • no_tx

    Specify explicitly whether or not transactions are possible. If they are not, then Tangram can guarantee consistency by serialising transaction updates - which guarantees poor performance and means that you can never use $storage->tx_rollback.

    If you are using MySQL, you should consider using the InnoDB table type to avoid this problem. Also note that you will explicitly have to set this option if you have InnoDB tables configured, as there is no real way of telling if transactions are available for any given query without either trying to do a rollback, or querying the table types for every table, which I don't think is Tangram's duty!

  • no_subselects

    Functions that need to perform sub-selects will die immediately or attempt to emulate the functionality required, rather than relying on the RDBMS to return a failure.

    This is currently ignored, but that's not functionally relevant :-). It can be read as $storage->{no_subselects} however, as the correct value is automatically detected on connection.

All fields are optional.

dbh can be used to connect a Storage via an existing DBI handle. $data_source, $username and $auth are still needed because Tangram may need to open extra connections (see "INTERNAL CONNECTION").

INSTANCE METHODS

insert

   $storage->insert( @objs );

Inserts objects in storage. Returns the ID(s) assigned to the object(s). This method is valid in both scalar and list contexts.

The inserted objects must be of a class described in the schema associated to the storage.

Attempting to insert an object that is already persistent in the storage is an error.

Tangram will automatically insert any object that is referred to by an inserted object if it is not already present in storage. In the following example:

   my $homer = NaturalPerson->new(
      firstName => 'Homer', name => 'Simpson',
      children => Set::Object->new(
         NaturalPerson->new(
            firstName => 'Bart', name => 'Simpson' ),
         NaturalPerson->new(
            firstName => 'Lisa', name => 'Simpson' ),
         NaturalPerson->new(
            firstName => 'Maggie', name => 'Simpson'
      ) ) );

   $storage->insert( $homer );

...Tangram automatically inserts the kids along with Homer.

update

   $storage->update( @objs );

Saves objects to storage. This method is valid in both scalar and list contexts.

The objects must be of a class described in the schema associated to the storage.

Attempting to update an object that is not already present in the storage is an error.

Tangram will automatically insert any object that is referred to by an updated object if it is not already present in storage. It will not automatically update the referred objects that are already stored. In the following example:

   my $homer = NaturalPerson->new(
      firstName => 'Homer', name => 'Simpson' );
   $storage->insert( $homer );

   my $marge = NaturalPerson->new(
      firstName => 'Marge', name => 'Simpson',
      age => 34 );
   $storage->insert( $marge );

   $marge->{age} = 35;

   $homer->{partner} = $marge;

   $homer->{children} = Set::Object->new(
      NaturalPerson->new(
         firstName => 'Bart', name => 'Simpson' ),
      NaturalPerson->new(
         firstName => 'Lisa', name => 'Simpson' ),
      NaturalPerson->new(
         firstName => 'Maggie', name => 'Simpson' ) );

   $storage->update( $homer );

...Tangram automatically inserts the kids when their father is updated. On the other hand, $marge, who is already stored, will not be automatically updated; her age will remain '34' in persistent storage.

id

   $id = $storage->id( $obj );
   @id = $storage->id( @obj );

Returns the IDs of the given objects. If an object is not persistent in storage yet, its corresponding ID is undef().

This method is valid in both scalar and list contexts.

oid_isa

   if ($storage->oid_isa($id, "ClassName")) {
      ...
   }

Checks that the passed Object ID, $id, is a "ClassName" according to the schema. This check relies solely on the information in the schema, not Perl's idea of ->isa relationships.

load

   $obj = $storage->load( $id );
   @obj = $storage->load( @id );

Returns a list of objects given their IDs. Dies if any ID has no corresponding persistent object in storage.

This method is valid in both scalar and list contexts.

remote

   @remote = $storage->remote( @classes );

Returns a list of Tangram::Remote objects of given classes. This method is valid in both scalar and list contexts.

select

   @objs = select( $remote );

   @objs = select( $remote, $filter );

   @objs = select( $remote,
      opt1 => val1, opt2 => val2, ...);

Valid only in list context. Returns a list containing all the objects that satisfy $filter.

$remote can be either a Remote object or an array of Remote objects. If it is a single Remote, a list of objects is returned. If it is an array, a list of arrays of objects is returned.

If one argument is passed, select() returns all the objects of the given type.

If two arguments are passed, the second argument must be a Filter. select() returns the objects that satisfy $filter and are type-compatible with the corresponding Remote.

If more than two arguments are passed, the arguments after $remote are treated as key/value pairs. Currently Tangram recognizes the following directives:

  • filter

  • distinct

  • order

  • desc

  • limit

filter specifies a Filter that can be used to restrict the result set.

distinct is a boolean; a true value specifies that each object should occur only once in the result set (Tangram generates a SELECT DISTINCT). In general, this is a good idea.

order specifies attributes in terms of one or more of the remote objects - any that are being selected, or any that appear in the filter.

desc specifies that the order should be descending. When several columns are given in order, desc currently applies only to the last one :-}. This syntax is therefore deprecated, to be replaced in some future Tangram release with a unary - operator on the order columns.

limit is a maximum number of rows to retrieve; in fact, with some databases you can give two numbers to this to get the rows between N and M of a select. See your RDBMS manual for more. If you want to specify more than one number, you may use the following syntax:

   $storage->select( $remote, filter => (...),
                     limit => [ 5, 10 ] );

The above example would return rows 6 through 15 on a MySQL database.

The select method is valid only in list context.

cursor

   $cursor = $storage->cursor( $remote );
   $cursor = $storage->cursor( $remote, $filter );
   $cursor = $storage->cursor( $remote,
      opt1 => val1, opt2 => val2, ...);

Valid only in scalar context.

Returns a Cursor on the objects that are type-compatible with $remote.

If one argument is passed, the cursor returns all the objects of the given type.

If two arguments are passed, the second argument must be a Filter. The cursor returns the objects that satisfy $filter and are type-compatible with the corresponding Remote.

If more than two arguments are passed, the arguments after $remote are treated as key/value pairs. Currently Tangram recognizes the following directives:

  • filter

  • order

  • desc

  • distinct

  • retrieve

For options filter, order, desc and distinct, see select.

Option retrieve is an array of Expr, to be retrieved in addition to the object itself.

prefetch

   $storage->prefetch("Class", "collection", $filter);

This method fetches the "collection" collections of all the "Class" objects that satisfy $filter.

You need to be very careful with your filter - it is quite easy to end up with a filter that will include a single table twice with no join.

You should not include an expression in the filter that matches the type of object that you are prefetching, unless that is a *different* object to the one you want to load.

You should replace the text "Class" with a Tangram::Remote object from your $filter if it appears in the expression.

This code is OK:

   my $r_parent = $storage->remote( "NaturalPerson" );
   my $filter = ($r_parent->{age} > 40);

   my @parent = $storage->select($r_parent, $filter);
   $storage->prefetch($r_parent, "children", $filter);

But this code has a problem:

   my $r_parent = $storage->remote( "NaturalPerson" );
   my $r_child  = $storage->remote( "NaturalPerson" );

   my $filter = (
                 ($r_parent->{age} > 40) &
                  $r_parent->{children}->includes($r_child)
                );

   my @parent = $storage->select($r_parent, $filter);
   my @children = $storage->select($r_child, $filter);

   $storage->prefetch($r_parent, "children", $filter);

Because $filter contains an extra `unnecessary' relationship with $r_child, the filter that Tangram builds internally ends up looking like:

    (
     ($r_parent->{age} > 40) &
     $r_parent->{children}->includes($r_child) &
     $r_parent->{children}->includes($r_child2)
    );

So, you end up including extra tables without joining them. This situation does not make any sense, but unfortunately, because of the way relational databases are defined to work, the RDBMS is required to give you every combination of rows (the Cartesian product) from the unjoined tables. <sigh>
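
The effect of a missing join can be seen with plain Perl lists. This is only an illustrative sketch: the names, the two arrays and the %parent_of mapping are made up for the example and have nothing to do with Tangram's generated SQL.

```perl
use strict;
use warnings;

# Hypothetical "tables": three parents and four children.
my @parents   = qw(Homer Marge Ned);
my @children  = qw(Bart Lisa Maggie Rod);
my %parent_of = ( Bart => 'Homer', Lisa => 'Homer',
                  Maggie => 'Homer', Rod => 'Ned' );

# No join condition: every parent is paired with every child.
my @unjoined;
for my $p (@parents) {
    for my $c (@children) {
        push @unjoined, [ $p, $c ];
    }
}

# With a join condition only the matching pairs survive.
my @joined = grep { $parent_of{ $_->[1] } eq $_->[0] } @unjoined;

printf "unjoined: %d rows, joined: %d rows\n",
    scalar @unjoined, scalar @joined;
```

With three parents and four children the unjoined result has twelve rows, of which only four are meaningful. On real tables the product grows with the size of every unjoined table.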

erase

   $storage->erase( @obj );

Removes objects from persistent storage. The objects remain present in transient storage.

tx_start

   $storage->tx_start();

Starts a new Tangram transaction.

Tangram transactions can be nested.

Tangram maintains a transaction nesting count for each storage and commits the operations only when that count reaches zero. This scheme makes it easy for a function to collaborate with its caller in the management of the "internal connection".

Example:

   sub f
   {
      $storage->tx_start();
      $storage->update( $homer );
      $storage->tx_commit(); # or perhaps rollback()
   }

   sub g
   {
      $storage->tx_start();
      f();
      $storage->update( $marge );
      $storage->tx_commit(); # or perhaps rollback()
   }

   f(); # 1
   g(); # 2

In (1), f() commits the changes to $homer directly to the database.

In (2), f() transparently reuses the transaction opened by g(). Changes to both $homer and $marge are committed to the database when g() calls tx_commit().
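
The nesting count can be pictured with a small stand-alone model. The NestingSketch class below is hypothetical and is not Tangram's implementation; it only records when a database-level BEGIN or COMMIT would be issued under the scheme described above.

```perl
use strict;
use warnings;

# Hypothetical model of a storage that counts nested transactions.
package NestingSketch;

sub new { bless { depth => 0, log => [] }, shift }

sub tx_start {
    my $self = shift;
    # Only the outermost tx_start opens a database transaction.
    push @{ $self->{log} }, 'BEGIN' if $self->{depth}++ == 0;
}

sub tx_commit {
    my $self = shift;
    die "no transaction in progress" unless $self->{depth} > 0;
    # Only the outermost tx_commit reaches the database.
    push @{ $self->{log} }, 'COMMIT' if --$self->{depth} == 0;
}

package main;

my $storage = NestingSketch->new;

sub f { $storage->tx_start; $storage->tx_commit }
sub g { $storage->tx_start; f(); $storage->tx_commit }

f();    # depth 0 -> 1 -> 0: one BEGIN/COMMIT pair
g();    # depth 0 -> 1 -> 2 -> 1 -> 0: still only one pair
print join( ' ', @{ $storage->{log} } ), "\n";
```

The log ends up with two BEGIN/COMMIT pairs, one for each top-level call, even though tx_start() was called three times in total.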

tx_commit

   $storage->tx_commit();

Commits the current Tangram transaction for this storage. If the transaction being committed is the outermost transaction for this storage, the DBI transaction is also committed.

tx_rollback

   $storage->tx_rollback();

Rolls back the current Tangram transaction for this storage. If the transaction being rolled back is the outermost transaction for this storage, the DBI transaction is also rolled back.

tx_do

   $storage->tx_do( sub { ... } );

Executes CODEREF, the subroutine reference given as first argument, under the protection of a Tangram transaction and passes it @args, any remaining arguments, as its argument list.

Rolls back the transaction if CODEREF dies, in which case the exception is re-thrown.

Returns the results of CODEREF, either as a scalar or as a list depending on the context in which tx_do was called.

Example:

   $storage->tx_do(
      sub
      {
         $storage->update( $homer );
         # do things, die perhaps
         $storage->update( $marge );
      } );

Both $homer and $marge will be updated, or neither will, depending on whether the anonymous subroutine passed to tx_do() dies.
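
For readers who want to see the control flow, here is a rough sketch of how such a wrapper could be written on top of tx_start(), tx_commit() and tx_rollback(). Both tx_do_sketch() and the MockStorage class are hypothetical names invented for this example; this is not Tangram's own code.

```perl
use strict;
use warnings;

# Hypothetical wrapper; $storage is any object offering tx_start,
# tx_commit and tx_rollback as described in this document.
sub tx_do_sketch {
    my ( $storage, $code, @args ) = @_;
    my $want_list = wantarray;      # remember the caller's context
    $storage->tx_start;
    my @results;
    my $ok = eval {
        @results = $want_list ? $code->(@args) : scalar $code->(@args);
        1;
    };
    unless ($ok) {
        my $err = $@;               # save it before anything can clobber $@
        $storage->tx_rollback;
        die $err;                   # re-throw to the caller
    }
    $storage->tx_commit;
    return $want_list ? @results : $results[0];
}

# Mock storage that only records which methods were called.
package MockStorage;
sub new         { bless { log => [] }, shift }
sub tx_start    { push @{ $_[0]{log} }, 'start' }
sub tx_commit   { push @{ $_[0]{log} }, 'commit' }
sub tx_rollback { push @{ $_[0]{log} }, 'rollback' }

package main;

my $mock = MockStorage->new;
my $sum  = tx_do_sketch( $mock, sub { $_[0] + $_[1] }, 2, 3 );
eval { tx_do_sketch( $mock, sub { die "oops\n" } ) };
print "$sum; @{ $mock->{log} }; error: $@";
```

The first call commits and returns 5; the second rolls back and re-throws the exception, so the mock's log reads start, commit, start, rollback.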

unload

   $storage->unload( @objs );

Drops references to persistent objects present in memory. @objs may contain both objects and object ids. If @objs is empty, unloads all the objects loaded by this storage.

Storage keeps track of all the persistent objects that are present in memory, in order to make sure that loading the same object twice results in a single copy of the object.

As a consequence, when weak references are not available (see MEMORY MANAGEMENT), these objects will not be reclaimed by Perl's automatic memory management mechanism until either disconnect() or unload() is called.

unload() should be called only when no other references exist to persistent objects, otherwise the same object (in the database) may end up having two copies in transient storage.
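
The risk can be reproduced with a toy cache. In this sketch %db, %cache, load_sketch() and unload_sketch() are all hypothetical stand-ins chosen for illustration; they only mimic the "one copy per object id" bookkeeping described above.

```perl
use strict;
use warnings;

my %db = ( 42 => { name => 'Homer' } );   # pretend database rows
my %cache;                                # object id => object in memory

sub load_sketch {
    my ($oid) = @_;
    # Reuse the in-memory copy if there is one, otherwise build a fresh one.
    return $cache{$oid} ||= { %{ $db{$oid} } };
}

sub unload_sketch { delete @cache{@_} }

my $first  = load_sketch(42);
my $second = load_sketch(42);
print $first == $second ? "same object\n" : "two copies\n";

unload_sketch(42);                        # $first is still referenced here
my $third = load_sketch(42);
print $first == $third ? "same object\n" : "two copies\n";
```

The first comparison finds a single shared object. After the unload, the client still holds $first, so the next load builds a second, independent copy of the same database row.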

disconnect

   $storage->disconnect();

Disconnects from the database. Drops references to persistent objects present in memory (see unload).