
NAME

XAO::DO::FS::List - List class for XAO::FS

SYNOPSIS

 my $customers_list=$odb->fetch('/Customers');

 my $customer=$customers_list->get('cust0001');

DESCRIPTION

The List object is usually used as is, without overriding it. The XAO class name for the list object is FS::List.

A list object provides methods for managing a list of FS::Hash objects of the same class -- storing, retrieving and searching on them.

List class shares most of the API with the Hash class.

Here is the list of all List methods (alphabetically):

check_name ()

Object names in lists are subject to nearly the same restrictions as in hashes, with one exception: they can start with a digit. This behavior might be extended to hashes in later versions to eliminate the difference.

For example, 123ABC456 is a legal Hash id inside of a List, but is not a legal property ID or List ID inside of a Hash.

container_key ()

Returns the key that refers to the current List in the upper-level Hash.

container_object ()

Returns a reference to the hash that contains the current list.

Example:

 my $customer=$orders_list->container_object();

Do not overuse this method: it involves a lot of overhead and can slow your application down.

delete ($)

Deletes an object from the list: first calls destroy on the object to recursively delete its content, then drops it from the list.

describe ()

Describes itself, returns a hash reference with at least the following elements:

 type       => 'list'
 class      => class name of Hashes stored inside
 key        => key name

detach ()

Detaches current object from the database. Not implemented, but safe to use in read-only situations.

exists ($)

Checks if an object with the given name exists in the list and returns a boolean value.

get (@)

Retrieves a Hash object from the List using the given name.

As a convenience you can pass more than one object name to the get() method to retrieve multiple Hash references at once.

If an object does not exist, an error is thrown.

get_new ()

Convenience method that returns a new, empty, detached object of the type that the list can store.

glue ()

Returns the Glue object that was used to retrieve the current object.

key_charset

Returns key charset for the given list. Default is 'binary'.

key_format

Returns key format for the given list. Default is '<$RANDOM$>'.

key_length

Returns key length for the given list. Default is 30.

keys ()

Returns an unsorted list of the keys of all objects stored in the list.

new (%)

You cannot use this method directly. Use some equivalent of the following code to get List reference:

 $hash->add_placeholder(name => 'Orders',
                        type => 'list',
                        class => 'Data::Order',
                        key => 'order_id');

....

 my $orders_list=$hash->get('Orders');

objtype ()

For all List objects, always returns the string 'List'.

put ($;$)

The only difference between a list object's put() and a data object's put() is that the key argument is not required. If only one argument is given, a unique key is generated and returned from the method.

The key is guaranteed to consist of up to 20 alphanumeric characters. The key uniquely identifies the stored object within the current list scope; it does not have to be unique among all objects of that class.

The value has to be a reference to an object of the same class that was defined when the list object was created.

Example of adding new object into list:

 my $customer=XAO::Objects->new(objname => 'Data::Customer');
 my $id=$custlist->put($customer);

An attempt to put an already attached data object into an attached list under the same key name is meaningless and does nothing.

Note: An object stored by calling the put() method is not modified and remains in the detached state if it was detached. If an object already exists in the list under the same ID, its content is completely replaced by the new object's content. It is safe to call put() to store an attached object under a new name: the object is cloned. To retrieve the newly stored object from the database you have to call get().

scan (@)

A wrapper for search() method that makes it easier to search in large collections of data. Will use 'offset' and 'limit' options to search in multiple iterations and will use callbacks to process results.

Parameters are:

  block_size        => required, how many records to retrieve at once
  search_query      => optional array reference, see below in search() method
  search_options    => required, must include at least an 'orderby' option
  call_before       => optional, called before the start
  call_block        => optional, called with the full retrieved data block
  call_row          => optional, called for each row of the data
  call_after        => optional, called after the scanning is done

All callbacks get at least two arguments: the reference to the list or collection being scanned, and the reference to the parameters passed to scan(). Call_block also gets a reference to the block and the block's zero-based offset. Call_row gets the result (either an array ref or a scalar, depending on the 'result' option) and its zero-based offset.

Scan() does not return anything, so at least one callback is necessary.
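
The iteration described above can be pictured with a minimal, self-contained sketch. This is not the XAO::FS implementation: scan_sketch and the $fetch code reference (a stand-in for search() with 'offset' and 'limit') are hypothetical names, and it is an assumption that call_row receives the row's offset within the whole result set.

```perl
use strict;
use warnings;

# Hypothetical stand-in for scan(). $fetch is an assumed code reference
# taking ($offset, $limit) and returning an array reference of results.
sub scan_sketch {
    my ($list, $fetch, %args) = @_;
    my $block_size = $args{block_size} or die "block_size is required\n";

    $args{call_before}->($list, \%args) if $args{call_before};

    my $offset = 0;
    while (1) {
        my $block = $fetch->($offset, $block_size);
        last unless @$block;

        # The whole block, plus its zero-based offset.
        $args{call_block}->($list, \%args, $block, $offset)
            if $args{call_block};

        # Each row, plus its zero-based offset (assumed to be global).
        if ($args{call_row}) {
            my $row_offset = $offset;
            $args{call_row}->($list, \%args, $_, $row_offset++)
                foreach @$block;
        }

        last if @$block < $block_size;    # a short block is the last one
        $offset += @$block;
    }

    $args{call_after}->($list, \%args) if $args{call_after};
    return;
}
```

Because the data is fetched in separate queries, a stable ordering is needed for the blocks to line up, which is presumably why 'orderby' is required in search_options.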

Example:

    $list->scan(
        block_size      => 10000,
        search_options  => {
            orderby => 'item_code',
            result  => [ 'item_code','cat_page' ],
        },
        call_block      => sub {
            my ($l,$a,$block,$offset)=@_;
            dprint "..$offset";
            foreach my $row (@$block) {
                #...
            }
        },
    );

search (@)

Returns a reference to the list of IDs of objects corresponding to the given criteria.

Takes a Perl array or Perl array reference of the following format:

 [ [ 'first_name', 'ws', 'a'], 'or', [ 'age', 'gt', 20 ] ]

All innermost conditions consist of exactly three elements: the first is an object property name, the second is a comparison operator, and the third is either an arbitrary value to compare the field with or an array reference.

As a convenience, if the right-hand side of a condition refers to an array, the field is compared with each listed value using the given operator and the results are joined with the OR logical operator. These two examples are equivalent and would be translated to the same database query:

 my $r=$list->search('name', 'wq', [ 'big', 'ugly' ]);

 my $r=$list->search([ 'name', 'wq', 'big' ],
                     'or',
                     [ 'name', 'wq', 'ugly' ]);
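
The equivalence above can be illustrated with a small rewrite function. This is only a sketch: expand_condition is a hypothetical helper, not part of XAO::FS, and rejecting an empty value list is an assumption, since the text does not say how that case is handled.

```perl
use strict;
use warnings;

# Hypothetical helper: turns [ field, op, [ v1, v2, ... ] ] into
# explicit three-element conditions joined with 'or'.
sub expand_condition {
    my ($field, $op, $rhs) = @_;

    # A plain scalar on the right-hand side needs no expansion.
    return [ $field, $op, $rhs ] unless ref($rhs) eq 'ARRAY';

    my @values = @$rhs;
    die "empty value list\n" unless @values;    # assumption, see above

    my $expr = [ $field, $op, shift @values ];
    foreach my $value (@values) {
        $expr = [ $expr, 'or', [ $field, $op, $value ] ];
    }
    return $expr;
}

# Yields [ [ 'name','wq','big' ], 'or', [ 'name','wq','ugly' ] ]
my $expanded = expand_condition('name', 'wq', [ 'big', 'ugly' ]);
```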

It is possible to search on properties of objects related to the objects in the list. Let's say you have a list of specification values inside of a product. To search for products that have a specific value in their specification you would then do:

 my $r=$list->search(['Specification/name', 'eq', 'Width'],
                     'and',
                     ['Specification/value', 'eq', '123']);

You are not limited to objects down the tree; you can search on objects up the tree as well. This is mostly useful for collection objects, because otherwise there is a single object on top and the search turns into a boolean yes/no check.

Example:

 my $r=$invoices->search([ '/Customers/name', 'cs', 'John' ],
                         'and',
                         [ '../gross_premium', 'lt', 1000 ]);

Sometimes it might be necessary to check if a pair of objects inside of some container have specific properties. This can be achieved with instance specifiers:

 my $r=$products->search([ [ 'Spec/1/name', 'eq', 'Width' ],
                           'and',
                           [ 'Spec/1/value', 'eq', '123' ],
                         ],
                         'and',
                         [ [ 'Spec/2/name', 'eq', 'Height' ],
                           'and',
                           [ 'Spec/2/value', 'eq', '345' ],
                         ]);

Numbers 1 and 2 here indicate that the first name/value pair must be checked on the same object, and the second pair on another object. The numbers have no meaning by themselves: 1 and 2 can be replaced with 234 and 345 without changing the effect in any way. Some very complex criteria can be expressed this way, and in most cases execution by the underlying database layer will be quite efficient, as no postprocessing is usually required.

Another option is to use an asterisk, which means "assume a new instance every time". This can be useful if we want to find an object whose container holds a couple of objects, each satisfying some simple criteria. For instance, to find an imaginary person profile that has both sound and image attached:

 my $r=$profiles->search([ 'Files/*/mime_type', 'sw', 'image/' ],
                         'and',
                         [ 'Files/*/mime_type', 'sw', 'audio/' ]);

In theory bizarre cases like this should work as well, although no good example of real life usage comes to mind:

 my $r=$list->search([ '../../A/1/B/2/C/name', 'cs', 't1' ],
                     'and',
                     [ '/X/A/2/B/1/C/desc', 'eq', 't2' ]);

See also the 'index' option below for a way to suggest the most effective index.

This can be extended as deep as you want. See also collection() method on Glue and XAO::DO::FS::Collection for additional search capabilities.

Multiple blocks may be combined into complex expressions using logical operators.

Comparison operators:

cs

True if the field contains the given string. There are no limitations as to what could be in the string. Having a dictionary on the field will not speed up the search.

eq

True if equal.

ge

True if greater or equal.

gt

True if greater.

le

True if less or equal.

lt

True if less.

ne

True if not equal.

sw

True if property starts with the given string. For example ['name', 'sw', 'mar'] will match 'Marie Ann', but will not match 'Ann Marie'.

In most databases (MySQL included) this type of search is optimized using indexes if they are available. Consider making the field indexable if you plan to perform this type of search frequently.

wq

True if property contains the given word completely. For example ['name', 'wq', 'ann'] would match 'Ann Peters' and 'Marie Ann', but would not match 'Annette'.

ws

True if property contains a word that starts with the given text. For example ['name', 'ws', 'an'] would match 'Andrew' and 'Marie Ann', but 'Joann' would not match.
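
The difference between the four string operators can be modeled in memory with regular expressions. This is a sketch rather than how any XAO::FS driver works: string_match is a hypothetical function, and it assumes case-insensitive comparison (as the examples above suggest) and that a word is a run of \w characters. Actual behavior depends on the underlying database.

```perl
use strict;
use warnings;

# Hypothetical in-memory model of the cs, sw, wq and ws operators.
sub string_match {
    my ($value, $op, $text) = @_;
    my $q = quotemeta($text);    # treat the search text literally

    my %regex = (
        cs => qr/$q/i,                  # contains the string anywhere
        sw => qr/^$q/i,                 # property starts with the string
        wq => qr/(?<!\w)$q(?!\w)/i,     # contains the complete word
        ws => qr/(?<!\w)$q/i,           # contains a word starting with it
    );

    my $re = $regex{$op} or die "Unsupported operator '$op'\n";
    return $value =~ $re ? 1 : 0;
}

foreach my $name ('Ann Peters', 'Marie Ann', 'Annette', 'Joann') {
    printf "%-10s wq=%d ws=%d sw=%d cs=%d\n", $name,
        map { string_match($name, $_, 'ann') } qw(wq ws sw cs);
}
```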

Logical operators:

and - true if both are true (has an alias -- '&&')
or - true if either one is true (has an alias -- '||')

Examples:

 ##
 # Search for persons in the given age bracket
 #
 my $list=$persons->search([ 'age', 'ge', 25 ],
                           'and',
                           [ 'age', 'le', 35 ]);

 ##
 # A little more complex search.
 #
 my $list=$persons->search([ 'name', 'ws', 'john' ],
                           'and',
                           [ [ 'balance', 'ge', 10000 ],
                             'or',
                             [ 'rating', 'ge', 2.5 ]
                           ]);

The search() method can also accept additional options that can alter results. Supported options are:

orderby

Sorts results by any field in either ascending or descending order. Example:

 my $list=$persons->search('age', 'gt', 60, {
                               'orderby' => [
                                   ascend => 'first_name',
                                   descend => 'second_name',
                               ]
                          });

Note that you pass an array reference, not a hash reference, to preserve the order of arguments.

If you want to order by just one field, it is safe to pass that field name without wrapping it into an array reference (sorting is then performed in ascending order):

 my $list=$persons->search('age', 'gt', 60, {
                               'orderby' => 'first_name'
                          });
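
To see why an array reference is needed, here is a sketch of how the two accepted forms could be normalized into ordered direction/field pairs. normalize_orderby is a hypothetical helper, not an XAO::FS function. A hash would not only lose the order, it would also collapse two 'ascend' entries into one.

```perl
use strict;
use warnings;

# Hypothetical helper: returns a list of [ direction, field ] pairs
# in the order in which they were given.
sub normalize_orderby {
    my ($orderby) = @_;

    # A bare field name means ascending order on that field.
    return ([ ascend => $orderby ]) unless ref($orderby) eq 'ARRAY';

    die "orderby needs direction/field pairs\n" if @$orderby % 2;

    my @pairs;
    for (my $i = 0; $i < @$orderby; $i += 2) {
        push @pairs, [ $orderby->[$i], $orderby->[$i + 1] ];
    }
    return @pairs;
}

my @pairs = normalize_orderby([
    ascend  => 'first_name',
    descend => 'second_name',
]);
print join(', ', map { "$_->[1] ($_->[0])" } @pairs), "\n";
```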

Caveats: There is no way to alter the sorting tables used by the database. It is generally safe to assume that English letters and digits are sorted in the expected way, but there is no guarantee of that.

Remember that even though you sort results on the content of a field, it is not that field that is returned to you; you still get a list of object IDs unless you also use the 'result' option.

distinct

Returns only the rows that have unique values in the given field. Example:

 my $color_ids=$products->search('category_id', 'eq', 123, {
                                    'distinct' => 'color'
                                });

debug

Turns on debug messages in the underlying driver. Messages will be printed to standard error stream and their content depends on the specific driver. Usually that would be a fully prepared SQL query just before sending it to the SQL engine.

index

Accepts one argument -- a field name (or a path to a field) that should be used as an index. Normally you do not need this option, as in most cases the underlying driver/database will make the right decision automatically. It might make sense together with the 'debug' option and manual checking of the performance of specific queries.

The following example might make sense if you know for sure that the restriction by image will significantly reduce the number of hits, while the age range leaves too many matches open for checks:

 my $r=$list->search([ [ 'age', 'gt', 10 ],
                       'and',
                       [ 'age', 'lt', 60 ],
                     ],
                     'and',
                     [ 'Files/mime_type', 'sw', 'image/' ],
                     { index => 'Files/mime_type' });

limit

Indicates that you are only interested in a limited number of results, allowing the database to return just that many and therefore optimize the query or data transfer.

Remember that you can still get more results than you ask for if the underlying database does not support this feature.

 my $subset=$persons->search('eye_color','eq','brown', {
                                 'limit' => 100
                            });

offset

Indicates that you are only interested in results starting at the 'offset' position in the resulting set. In combination with 'limit' this can be used to page through large result sets.

result

This is a very powerful option that can significantly decrease your database load if used properly.

By default search() method returns a reference to an array of object keys. The 'result' option allows you to alter that behavior and receive database values in the same query, avoiding an extra step of retrieving the object and calling its get() method.

Note that if you're going to use only a portion of the results, it may be faster to get only IDs in the usual manner and then load the data you need. Loading multiple fields of data on more objects than you actually need may be significantly slower.

You can pass a single description of a return field or multiple descriptions as an array reference. Regardless of how many fields you request, if the 'result' option is used you always get a reference to an array of arrays.

The returned rows may contain more fields than you requested, but they are guaranteed to contain at least the requested fields, in the requested order. It is best to ignore any extra fields; their presence and content are not guaranteed.

In addition to usual field names some special names are supported:

#connector

Returns the collection key for the "parent" object (the one containing the list being searched). The value of the key can be used to pull the parent object from its class-based collection. The typical use is to identify different object parents for future processing without pulling any of their data, which allows for faster searching without any table joins.

#container_key

Returns the value of the key in the list you're searching on.

#collection_key

Returns the value of the collection key for the object (collection-unique, can be used to pull this object from a collection rather than a specific list).

#id

Returns collection key if the search is being done on a collection, and a list key if on a list. In other words, returns the same ID that would have been returned without the 'result' option.

Examples:

 my $rr=$data->search(
    'last_name','cs','smit', {
    orderby => 'last_name',
    result => [qw(#container_key last_name first_name age)]
 });
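
Since rows come back as plain arrays and may carry extra trailing fields, it can be convenient to map them onto named hashes right away. The sketch below does not call XAO::FS: $rr is hypothetical sample data shaped like the documented return value, and rows_to_hashes is an illustrative helper name.

```perl
use strict;
use warnings;

# Hypothetical data, shaped like the result of a search with
# result => [qw(#container_key last_name first_name age)].
# The fifth element of the first row stands for an unrequested extra.
my $rr = [
    [ 'cust0001', 'Smith',    'John', 42, 'extra' ],
    [ 'cust0002', 'Smithers', 'Anna', 37 ],
];

# Maps each row onto the requested field names, ignoring extras.
sub rows_to_hashes {
    my ($rows, @names) = @_;
    return map {
        my $row = $_;
        my %record;
        @record{@names} = @{$row}[0 .. $#names];
        \%record;
    } @$rows;
}

my @customers = rows_to_hashes($rr, qw(container_key last_name first_name age));
print "$_->{container_key}: $_->{first_name} $_->{last_name}\n"
    foreach @customers;
```

Slicing by the number of requested names is what keeps the code safe against extra fields.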

Beware that options such as 'orderby' and 'distinct' usually significantly decrease search performance. Only use them when you would otherwise sort or select unique rows in your own code anyway.

As a degenerate case of search, it is safe to pass nothing or just options to select everything in the given list. Examples:

 # These two lines are roughly equivalent. Note that you get an array
 # reference in the first line and an array itself in the second.
 #
 my $keys=$products->search();

 my @keys=$products->keys();

 # This is the way to get all the keys ordered by price.
 #
 my $keys=$products->search({ orderby => 'price' });

 # Getting name and surname for all records
 #
 my $data=$customers->search({ result => [qw(name surname)] });

values ()

Returns a list of all Hash objects in the list.

Note: the order of values is the same as the order of keys returned by the keys() method, at least until you modify the object directly or indirectly. For predictability reasons, using the values() method is not recommended.

uri ($)

Returns complete URI to either the object itself (if no argument is given) or to a property with the given name.

That URI can then be used to retrieve a property or object using $odb->fetch($uri). Be aware that fetch() is a relatively slow method and should not be overused.

Example:

 my $uri=$customer->uri;
 print "URI of that customer is: $uri\n";

AUTHORS

Copyright (c) 2005 Andrew Maltsev

Copyright (c) 2001-2004 Andrew Maltsev, XAO Inc.

<am@ejelta.com> -- http://ejelta.com/xao/

SEE ALSO

Further reading: XAO::FS, XAO::DO::FS::Hash (aka FS::Hash), XAO::DO::FS::Glue (aka FS::Glue).