The Perl Toolchain Summit needs more sponsors. If your company depends on Perl, please support this very important event.

NAME

Alzabo::Runtime::Cursor - Base class for Alzabo cursors

SYNOPSIS

  use Alzabo::Runtime::Cursor;

DESCRIPTION

This is the base class for cursors.

METHODS

new

Virtual method.

all_rows

Virtual method.

errors

If the last next call encountered a situation where the SQL query returned primary keys not actually in the target table, then the exception objects are stored in the cursor. This method can be used to retrieve these objects. This allows you to ignore these errors if you so desire without having to do explicit exception handling. For more information on what you can do with this method see the HANDLING ERRORS section.

Returns

A list of Alzabo::Exception::NoSuchRow objects.

reset

Resets the cursor so that the next next call will return the first row of the set.

HANDLING ERRORS

Let's assume a database in with following tables:

 TABLE NAME: movie

 movie_id          int
 title             varchar(50)

 TABLE NAME: movie_alias

 alias_id           int
 movie_id           int
 alias              varchar(50)

Now, let's assume you have a schema object $schema and you execute the following code:

 my $cursor = $schema->join( tables => [ $schema->tables( 'movie', 'movie_alias' ) ],
                             select => [ $schema->table('movie') ],
                             where  => [ $schema->table('movie_alias')->column('alias'), 'like', 'Foo%' ] );

The cursor returned is relying on the movie_id column in the movie_alias table. It's possible that there are values in this column that are not actually in the movie table but the cursor object will ignore the exceptions caused by these bad ids. The next method will not return a false value until its underlying DBI statement handle stops returning data. The reasoning behind this is that otherwise there would be no way to distinguish between: A) a false value caused by there being no more data coming back from the query on the movie_alias table and B) a false value caused by there being no row in the movie column matching a given movie_id value.

It is certainly possible that there are situations when you don't care about referential integrity and you want to simply get all the rows you can. In other cases, you will want to handle errors. I would have used exceptions for this purpose except the following code would then not function properly.

 while ( my $row = eval { $cursor->next } )
 {
     do_something if $@;  # or alternately just ignore $@

     ... do something with $row ...
 }

The reason is that throwing an exception in the eval block would cause the eval to return an undef. This means that the 'do_something if $@;' clause would _never_ get executed. In that case, you couldn't ignore the exception if you wanted to because it interrupts the while loop. The workaround would be:

 do
 {
     my $row = eval { $cursor->next };
     # either do something with $@ or ignore it.
 } while ( $row || ( $@ && $@->isa('Alzabo::Exception::NoSuchRow') );

However, this is not an idiom I particularly want to encourage, as it is counter-intuitive.

So, while throwing an exception may be the most 'correct' way to do it, I've instead created the errors method.

This means that the idiom for checking errors from the next method is as follows:

 while ( my $row = $cursor->next )
 {
     do_something if $cursor->errors;

     ... do something with $row ...
 }

The advantage here is that ignoring the exception is easy. If you want to check them then just remember that the errors method will return a list of Alzabo::Exception::NoSuchRow objects that occurred during the previous next call.

Also note that other types of exceptions are rethrown from the next method.

RATIONALE FOR CURSORS

Using cursors is definitely more complicated. However, there are two excellent reasons for using them: speed and memory savings. As an example, I did a test with the old code (which returned all its objects at once) against a table with about 8,000 rows using the Alzabo::Runtime::Table->all_rows method. Under the old implementation, it took significantly longer to return the first row. Even more importantly than that, the old implementation used up about 10MB of memory versus about 4MB! Now imagine that with a 1,000,000 row table.

For those curious to know why, here's the reason. Under perl, the following code:

 foreach (1..1_000_000)
 {
     print "$_\n";
 }

first constructs a temporary array with _all_ the values (that's an array one million scalars long!) and then returns it one by one. It takes a nontrivial amount of time to construct that array, meaning that the first print statement is delayed. Even worse, the array uses up memory.

Thus Alzabo now uses cursors so can scale better. This is a particularly big win in the case where you are working through a long list of rows and may stop before the end is reached. With cursors, Alzabo creates only as many rows as you need. Plus the start up time on your loop is much, much quicker. In the end, your program is quicker and less of a memory hog. This is good.

AUTHOR

Dave Rolsky, <autarch@urth.org>