NAME
Tie::Google - Single-variable access to Google search results
SYNOPSIS
my $KEYFILE = glob "~/.googlekey";
my ($g, @g, %g);
# Tied array interface
tie @g, "Tie::Google", $KEYFILE, "perl";
for my $r (@g) {
printf " * <a href='%s'>%s</a>\n",
$r->{'URL'}, $r->{'title'};
}
# Tied hash interface
tie %g, "Tie::Google", $KEYFILE;
for my $term (qw[ perl python ruby ]) {
my $res = $g{$term};
printf "%d results for '%s:\n", scalar @$res, $term;
for my $r (@$res) {
printf " * <a href='%s'>%s</a>\n",
$r->{'URL'}, $r->{'title'};
}
}
# Tied scalar interface: I Feel Lucky
use LWP::Simple qw(getprint);
tie $g, "Tie::Google", $KEYFILE, "perl";
getprint($g->{'URL'});
USING Tie::Google
Using tied variables can make searching Google much simpler for trivial programs. Tie::Google
presents a simple interface to Google's search API, using Net::Google
as the underlying transport mechanism. To use Tie::Google
, you must already be registered with Google and have an API key.
You can tie scalars, arrays, or hashes to the Tie::Google
class; each offers slightly different functionality, but all offer direct access to Google search results. The basic syntax of all types is:
tie VAR, 'Tie::Google', $APIKEY, $QUERY, \%OPTIONS;
where:
- VAR
-
VAR is the variable name, which can be a scalar, array, or hash:
tie $g, "Tie::Google", $KEY, $term; tie @g, "Tie::Google", $KEY, $term; tie %g, "Tie::Google", $KEY, $term;
- $APIKEY
-
APIKEY is your Google API key or a file containing the key as the only item on the first line; see http://apis.google.com for details.
- $QUERY
-
QUERY is your actual search term(s), as a string. This can be arbitrarily complex, up to the limits Google allows:
tie $g, "Tie::Google", $KEY, "site:cnn.com allintitle:priest court -judas";
- %OPTIONS
-
Any options specified in this hashref will be passed to the Net::Google::Search instance. Available options include
starts_at
,max_results
,ie
,oe
, andlr
. See Net::Google.
The Tied Array Interface
Tieing an array to Tie::Google
gives you an array of search results. How many search results are returned depends on the value of the max_results
option defined when the array was tied (or $DEFAULT_BATCH_SIZE if max_results
was not set), though extending the array of results can be done by growing the array.
$#g = 20;
Will resize the result set to 20 results. If there are more than 20, the ones on the end will be popped off; if there are less than 20, then more will be retrieved.
Tie::Google
supports all non-additive array operations, including shift
, pop
, and the 3 argument form of splice
(not the 4 argument version). Specifically unallowed are unshift
, push
, and general assigment.
See "RESULTS" for details about the individual search results.
The Tied Hash Interface
The tied hash interface is similar to the tied array interface, except that there are a bunch of them. Asking for a key in the hash %g initiates a search to Google, with the specified key as the search term:
my $results = $g{'perl apache'};
This initiates a search with "perl apache" as the query. $results is a reference to an array of hashrefs (see "RESULTS" for details about said hashrefs).
Tied hashes support all hash functions, including each
, keys
, and values
. Deleting from the hash is allowed (it removes the search results for the deleted terms), but adding to the hash is not.
There can be many sets of search results stored in a tied hash. To see what this looks like, try this:
use Data::Dumper;
my (%g, $KEY, $dummy);
tie %g, "Tie::Google", $KEY;
$dummy = $g{'perl'};
$dummy = $g{'python'};
$dummy = $g{'ruby'};
print Dumper(tied(%g));
Also, for comparison, try:
tie @g, "Tie::Google", $KEY, "perl";
print Dumper(tied(@g));
If starts_at
or max_results
are specified during the tie
, these options are carried over into new searches (when a new key is requested from the hash), so plan accordingly. If the max_value
is set to 1000, for example, then every access of a new key is going to contain 1000 elements, which will be pretty slow.
The Tied Scalar Interface
Do you feel lucky? If so, tie a scalar to Tie::Google
:
tie $g, "Tie::Google", $KEY, "python";
Will give you the top result. This is conceptually similar to using the "I Feel Lucky" button on Google.com's front search interface.
RESULTS
All results (values returned from these tied variables) are hash references; the contents of these hashrefs are based on the Net::Google::Response
class (see Net::Google::Response for details). These elements currently are:
title
URL
snippet
cachedSize
directoryTitle
summary
hostName
directoryCategory
All keys are case sensitive, and return exactly what Net::Google::Response says they do (Tie::Google
does no massaging of this data).
TODO / BUGS
This module is far from complete, or even fully thought out. TODO items currently include:
The tests currently suck, to the point of embarrassment. Don't mention it, I'm a little sensitive about it.
Some of the behaviors are kind of wonky. If anyone has any better ideas, please let me know. Patches demanded^Wwelcome^.
Tied arrays should get the next 10 results when you get to the end of the array. Currently, you have to manually extend the array using:
$#g = 100;
to get 100 search results.
Although this technique will have the unfortunate side-effect of trying to iterate through all the results in Google's database.
The tied hash interface should be implemented in terms of the tied array interface. That is, the values associated with each key (search term) should be a reference to an array tied to
Tie::Google
. I started doing it this way but it made my brain hurt.Does there need to be a TIEHANDLE interface as well? Hmmm...
while (<$google>) { ...
Should returned search results be data structures (they are currently), the actual
Net::Google::Result
instances (where the data is currently being derived from), or new objects in their own right (e.g.,Tie::Google::Result
)? I see advantages to each path: data structures would be simpler, passing onResult
objects without modification would be faster, and using a new set of objects allows new functionality to be added, for example useful stringification.
SEE ALSO
AUTHOR
darren chamberlain (<darren@cpan.org>), with some prompting from Richard Soderberg.