The Perl Toolchain Summit needs more sponsors. If your company depends on Perl, please support this very important event.

NAME

Webservice::InterMine::Cookbook::Recipe5 - Dealing with Results

SYNOPSIS

  # Get a list of first authors of papers about
  # Even Skipped, sorted by the number of their
  # papers in the database

  use Webservice::InterMine ('www.flymine.org/query');

  my $service = Webservice::InterMine->get_service;

  my $query = $service->resultset('Gene')
                      ->select('publications.*')
                      ->where(Gene => {lookup => 'eve', extra_value => 'D. melanogaster'});

  # Print out number of results
  printf "Found %s publications\n", $query->count;

  # Print out the first 10 rows to examine them
  $query->show_first(10);

  # Print all the results to a file in TSV format
  $query->print_results(to => '/tmp/publications.tsv', columnheaders => 1);

  # Process rows one by one - without loading all data into memory
  while (my $aref = <$query>) {
    # handle row as aref
  }

  my $it = $query->results_iterator(as => 'hashrefs');
  while (my $href = <$it>) { 
    # handle row as href
  }

  # Get all results as big array of arrays
  my $results = $query->results;

DESCRIPTION

There are two primary things one might want to do with the results returned by a query: store them and process them. We try to make both of these common tasks as trivially simple as possible;

Storage

The most common data storage format is the flat file (there are other options too - please see Recipe7 - Extending Webservice::InterMine). Storing results in a flat file is as simple as:

  $query->print_results(to => 'some/file.tsv');

Processing

More useful perhaps is processing your results: normally you would download results from somewhere, read them into a program, munge the data into a suitable data-structure, and only then be able to actually process the results. Here you can do it all in one step, and never have to leave Perl to do so.

As well as returning rows as tab separated strings, results can be returned as an arrayref of either arrayrefs or hashrefs, depending on your needs.(1) This means that in most cases, your data is already in a format suitable for processing. In addition there is results can be retrieved as full Perl objects.

Above, we can see two basic examples of using arrayrefs and hashrefs to readily access your data. Arrayrefs are particularly useful if you want to process each field in the returned results, and you know what order they will be in (they are returned in the same order as the view list specified on the query). Hashrefs can be more useful for providing direct access to individual fields by name, and they can have the benefit of more declarative, and thus maintainable code. For this reason hashrefs are the default if you call $query->results; without any format specified.

For unpacking your results, the following pattern will prove useful:

  while (my $row = <$it>) {
    # process $row
  }

Even if you are expecting 100,000's of results, this is designed to be memory efficient by streaming the results line by line rather than storing them all in memory.

CONCLUSION

By default, result rows can be returned one of several different formats: strings (for flat file storage), and hash and array references (for processing). Hash and array references (of which the default is hashrefs) make for powerful and flexible data-structures which get out of the way between you and your data.

FOOTNOTES

(1) References in Perl. Perl has a sophisticated native system of references (similar to C-style pointers) and nested data structures. The two used most frequently (and used here) are references to arrays (arrayrefs) and references to hashes (hashrefs). These data-structures function exactly the same as normal hashes and arrays, but ways of referencing values in them differ:

  my @array    = ('one', 'two', 'three');
  my $arrayref = ['uno', 'duo', 'tre'];
  my $first_english = $array[0];
  my $first_italian = $arrayref->[0];

  my %hash    = (one => 'uno',  two => 'duo',  three => 'tre');
  my %hashref = {one => 'eins', two => 'zwei', three => 'drei'};
  my $italian_for_two = $hash{two};
  my $german_for_two  = $hashref->{two};

Note the differences in bracketing and the use of the arrow (dereferencing) operator.

SEE ALSO

perldoc perlref

AUTHOR

Alex Kalderimis <dev@intermine.org>

BUGS

Please report any bugs or feature requests to dev@intermine.org.

SUPPORT

You can find documentation for this module with the perldoc command.

    perldoc Webservice::InterMine

You can also look for information at:

COPYRIGHT AND LICENSE

Copyright 2006 - 2010 FlyMine, all rights reserved.

This program is free software; you can redistribute it and/or modify it under the same terms as Perl itself.