WWW::Scraper - framework for scraping results from search engines.
NOTE: You can find a full description of the Scraper framework in WWW::Scraper::ScraperPOD.pm.
    use WWW::Scraper;
    $scraper = new WWW::Scraper('engineName', $queryString);
    $scraper->GetRequest->$fieldName($fieldValue);
    $response = $scraper->next_response();
    print $response->$fieldName();
"Scraper" is a framework for issuing queries to a search engine and scraping data from the resulting multi-page responses and their associated detail pages.
As a framework, it lets you get these results with only a slight knowledge of HTML and Perl. (Everything you need to know is covered in WWW::Scraper::ScraperPOD.pm.)
A Perl script, "Scraper.pl", uses Scraper.pm to investigate the "advanced search page" of a search engine, issue a user-specified query, and parse the results. (Scraper.pm can also be used by itself to support more elaborate Perl search scripts.) Scraper.pl and Scraper.pm have enough intelligence to figure out how to interpret the search page and its results.
A simple opcode-based language makes it easy to describe the results and detail pages of new engines, and to adapt to occasional changes in an existing engine's format.
A common Request container makes searches across multiple search engines easy to implement, and adapts automatically to changes.
A common Response container makes it possible to interpret results the same way across all search engines, and it also adapts easily to changes.
Post-filtering provides a powerful client-based extension of the search capabilities to all search engines.
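The idea behind post-filtering can be illustrated with a minimal, self-contained sketch. It does not use the WWW::Scraper API: the record fields, the sample data, and the post_filter helper are all hypothetical, and plain hashes stand in for scraped responses. It only shows how a client can apply criteria that a search engine itself may not support, after the results have been retrieved.

```perl
use strict;
use warnings;

# Hypothetical result records standing in for scraped responses.
my @results = (
    { title => 'Perl Developer',  location => 'Seattle',  salary => 95000  },
    { title => 'Java Developer',  location => 'Portland', salary => 90000  },
    { title => 'Senior Perl Dev', location => 'Seattle',  salary => 120000 },
);

# post_filter (hypothetical helper): keep only the records for which
# every field named in $criteria is defined and passes its predicate.
sub post_filter {
    my ($criteria, @records) = @_;
    return grep {
        my $record = $_;
        my $keep   = 1;
        for my $field (keys %$criteria) {
            my $value = $record->{$field};
            unless (defined $value && $criteria->{$field}->($value)) {
                $keep = 0;
                last;
            }
        }
        $keep;
    } @records;
}

# Client-side criteria: title mentions Perl AND salary is at least 100000.
my @matches = post_filter(
    {
        title  => sub { $_[0] =~ /perl/i },
        salary => sub { $_[0] >= 100_000 },
    },
    @results,
);

print "$_->{title}\n" for @matches;
```

Because the filtering runs on the client after retrieval, the same criteria can be applied unchanged to results from any engine.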
Glenn Wood, http://search.cpan.org/search?mode=author&query=GLENNWOOD.
Copyright (C) 2001-2002 Glenn Wood. All rights reserved. This program is free software; you can redistribute it and/or modify it under the same terms as Perl itself.
To install WWW::Scraper, copy and paste the appropriate command into your terminal.
cpanm
cpanm WWW::Scraper
CPAN shell
perl -MCPAN -e shell
install WWW::Scraper
For more information on module installation, please visit the detailed CPAN module installation guide.