App::scrape - simple HTML scraping
This is a simple module to extract data from HTML by specifying CSS3 or XPath selectors.
    use App::scrape 'scrape';
    use LWP::Simple 'get';
    use Data::Dumper;

    my $html = get('http://perlmonks.org');

    # Selectors given as an array reference
    my @posts = scrape(
        $html,
        ['a','a@href'],
        {
            absolute => [qw[href src rel]],
            base     => 'http://perlmonks.org',
        },
    );
    print Dumper \@posts;

    # Selectors given as a hash reference with named fields
    # (@posts is reused here rather than declared a second time)
    @posts = scrape(
        $html,
        {
            title => 'a',
            url   => 'a@href',
        },
        {
            absolute => [qw[href src rel]],
            base     => 'http://perlmonks.org',
        },
    );
    print Dumper \@posts;
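The selector strings in the synopsis appear to combine a selector with an optional attribute name after an `@` sign, so that `a@href` reads as "the `href` attribute of every `a` element". The following is a minimal sketch of that reading using core Perl only. The helper `split_spec` is hypothetical and is not part of App::scrape, and the assumption that a spec without an attribute yields the element's text is an inference from the synopsis, not a description of the module's internals.

```perl
use strict;
use warnings;

# Hypothetical helper, not part of App::scrape: split a spec such as
# 'a@href' into its selector part and an optional attribute name.
sub split_spec {
    my ($spec) = @_;
    my $attr;

    # A trailing '@name' is taken to be the attribute to extract.
    if ( $spec =~ s{\@([\w-]+)\z}{} ) {
        $attr = $1;
    }
    return ( $spec, $attr );
}

for my $spec ( 'a', 'a@href', 'img@src' ) {
    my ( $selector, $attr ) = split_spec($spec);
    printf "%-8s selector=%-4s attribute=%s\n",
        $spec, $selector,
        defined $attr ? $attr : '(none, assumed to mean element text)';
}
```

Keeping the attribute in the same string as the selector keeps each field to a single value, which is why the hash form in the synopsis can map a field name directly to one string.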
This module implements yet another scraping engine to extract data from HTML.
This engine does not (yet) support nested data structures. For an engine that supports nesting, see Web::Scraper.
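Because the results are flat, any nesting has to be built after scraping. The sketch below shows one way this might be done in plain Perl. It assumes the hash-style call returns a list of hash references with `title` and `url` keys, as the synopsis suggests. The sample rows and the `example.com` and `example.org` URLs are hypothetical stand-ins for real scrape output.

```perl
use strict;
use warnings;

# Hypothetical sample rows, assumed to have the same shape as the
# result of the hash-style scrape() call in the synopsis.
my @posts = (
    { title => 'First',  url => 'http://example.com/one' },
    { title => 'Second', url => 'http://example.com/two' },
    { title => 'Third',  url => 'http://example.org/three' },
);

# Group the flat rows into a nested structure keyed by host name.
my %by_host;
for my $post (@posts) {
    my ($host) = $post->{url} =~ m{^\w+://([^/:]+)};
    $host = '(unknown)' unless defined $host;
    push @{ $by_host{$host} }, $post->{title};
}

for my $host ( sort keys %by_host ) {
    print "$host\n";
    print "  $_\n" for @{ $by_host{$host} };
}
```

This only regroups values that were already extracted. It does not recover the parent and child relationships of the HTML itself, which is what a nesting-aware engine provides.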
Web::Scraper - the scraper inspiring this module
The public repository of this module is http://github.com/Corion/App-scrape.
The public support forum of this program is http://perlmonks.org/.
Max Maischein corion@cpan.org
Copyright 2011-2011 by Max Maischein corion@cpan.org.
This module is released under the same terms as Perl itself.