Web::Scraper::Config - Run Web::Scraper From Config Files
  ---
  scraper:
    - process:
      - td>ul>li
      - trailers[]
      - scraper:
        - process_first:
          - li>b
          - title
          - TEXT
        - process_first:
          - ul>li>a[href]
          - url
          # quoted because a plain YAML scalar cannot start with "@"
          - '@href'
        - process:
          - ul>li>ul>li>a
          - movies[]
          - __callback(process_movie)__

  use Web::Scraper::Config;

  # $config is a config hashref or the name of a config file
  my $scraper = Web::Scraper::Config->new(
    $config,
    {
      callbacks => {
        process_movie => sub {
          my $elem = shift;
          return {
            text => $elem->as_text,
            href => $elem->attr('href')
          }
        }
      }
    }
  );
  $scraper->scrape($uri);
Web::Scraper::Config allows you to harness the power of Web::Scraper from a config file.
The config files can be written in any format that Config::Any understands, as long as it conforms to this module's rules.
The constructor, new, creates a new Web::Scraper::Config instance.
The first argument is either a hashref that represents a config, or the name of a config file. The config file can be in any format that Config::Any understands, as long as it loads into a hash that conforms to the Web::Scraper::Config rules.
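As a rough illustration of the hashref form, the sketch below writes a trimmed-down version of the synopsis config as a Perl data structure and passes it as the only argument. It assumes Web::Scraper::Config is installed and that the hashref mirrors what the YAML in the synopsis loads into; the callback rule is left out so that no options hash is needed.

```perl
use strict;
use warnings;
use Web::Scraper::Config;

# Assumption: this is the structure the synopsis YAML loads into,
# minus the rule that uses a callback placeholder.
my $config = {
    scraper => [
        {
            process => [
                'td>ul>li',
                'trailers[]',
                {
                    scraper => [
                        { process_first => [ 'li>b',          'title', 'TEXT'  ] },
                        { process_first => [ 'ul>li>a[href]', 'url',   '@href' ] },
                    ],
                },
            ],
        },
    ],
};

# Only the first argument is given here.
my $scraper = Web::Scraper::Config->new($config);
```

Passing a filename instead of `$config` should behave the same way, provided the file loads into an equivalent hash.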
The second argument (options) is optional, and is currently only used to provide callbacks to be called from the scraper. When Web::Scraper::Config encounters an element of the form:
  __callback(function_name)__
it replaces that element with the callback of the same name from the callbacks entry of the options hash.
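The substitution can be pictured with the following self-contained sketch. It is not the module's actual implementation: `resolve_callbacks` is a hypothetical helper that walks a config structure and swaps each `__callback(name)__` string for the matching code reference.

```perl
use strict;
use warnings;

# Hypothetical helper, not part of the Web::Scraper::Config API.
# Returns a copy of $node with callback placeholders replaced by coderefs.
sub resolve_callbacks {
    my ($node, $callbacks) = @_;

    if (ref $node eq 'ARRAY') {
        return [ map { resolve_callbacks($_, $callbacks) } @$node ];
    }
    if (ref $node eq 'HASH') {
        my %out;
        $out{$_} = resolve_callbacks($node->{$_}, $callbacks) for keys %$node;
        return \%out;
    }
    if (defined $node && !ref $node && $node =~ /^__callback\((\w+)\)__$/) {
        my $name = $1;
        my $code = $callbacks->{$name}
            or die "No callback registered for '$name'\n";
        return $code;
    }
    return $node;    # plain strings (selectors, keys, TEXT, @attr) pass through
}

my $rule     = [ 'ul>li>ul>li>a', 'movies[]', '__callback(process_movie)__' ];
my $resolved = resolve_callbacks($rule, { process_movie => sub { 'called' } });

print ref($resolved->[2]), "\n";    # prints "CODE"
```

Dying on an unknown name is a design choice of this sketch; it surfaces typos in the config early rather than at scrape time.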
The scrape method starts scraping. The semantics are exactly the same as Web::Scraper::scrape.
Daisuke Maki <daisuke@endeworks.jp>
This program is free software; you can redistribute it and/or modify it under the same terms as Perl itself.
See http://www.perl.com/perl/misc/Artistic.html