NAME
WWW::2ch - scraper for 2ch, a popular Japanese BBS
SYNOPSIS
    use WWW::2ch;

    # load a board, then walk its thread list
    my $bbs = WWW::2ch->new(url   => 'http://live19.2ch.net/ogame/',
                            cache => '/tmp/www2ch-cache');
    $bbs->load_setting;
    $bbs->load_subject;
    foreach my $dat ($bbs->subject->threads) {
        $dat->load;
        my $one = $dat->res(1);
        print $dat->title . "\n";
        print '>>1: ' . $one->body;
        foreach my $res ($dat->reslist) {
            print $res->resid . ':' . $res->date . "\n";
            print $res->body_text . "\n";
        }
        last;
    }

    # load a single thread from its URL
    my $bbs = WWW::2ch->new(url   => 'http://live19.2ch.net/test/read.cgi/ogame/1140947283/l50',
                            cache => '/tmp/www2ch-cache');
    my $dat = $bbs->subject->thread('1140947283');
    $dat->load;

    # retrieve a dat from the cache
    my $bbs = WWW::2ch->new(url   => 'http://live19.2ch.net/ogame/',
                            cache => '/home/ko/cpan/my/WWW-2ch/cache');
    my $dat = $bbs->recall_dat('1141300600');

    # parse a dat read from a file
    my $bbs = WWW::2ch->new(url   => 'http://live19.2ch.net/ogame/',
                            cache => '/home/ko/cpan/my/WWW-2ch/cache');
    open my $fh, '<', 'test.dat' or die "cannot open test.dat: $!";
    my $data = join('', <$fh>);
    close($fh);
    my $dat = $bbs->parse_dat($data);

    # returns the raw article data
    $dat->dat;

    # load a plugin by name
    my $bbs = WWW::2ch->new(url    => 'http://example.jp/test/read.cgi/ogame/1140947283/l50',
                            cache  => '/tmp/www2ch-cache',
                            plugin => 'ExampleJp');

    # load a plugin from a file
    my $bbs = WWW::2ch->new(url    => 'http://example.com/test/read.cgi/ogame/1140947283/l50',
                            cache  => '/tmp/www2ch-cache',
                            plugin => '/usr/local/www-2ch/lib/ExampleCom.pm');
DESCRIPTION
WWW::2ch scrapes 2ch, a popular Japanese BBS.
Other BBSes, news sites, and similar sites can also be scraped by adding a scraping plugin.
Please throttle your requests so that you do not flood the server with excessive access.
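One simple way to follow the advice about excessive access is to pause between thread loads. The sketch below is only an illustration: each_with_delay is a hypothetical helper, not part of WWW::2ch, and the three-second delay in the usage comment is an arbitrary assumption rather than a documented limit.

```perl
use strict;
use warnings;

# Hypothetical helper (not part of WWW::2ch): call $callback on each item,
# sleeping $delay seconds between consecutive calls.
sub each_with_delay {
    my ($delay, $callback, @items) = @_;
    my @results;
    for my $i (0 .. $#items) {
        # No pause before the first item, and none when $delay is 0.
        sleep $delay if $i > 0 && $delay > 0;
        push @results, $callback->($items[$i]);
    }
    return @results;
}

# Intended use with the objects from the SYNOPSIS (needs network access):
#   my @titles = each_with_delay(
#       3,
#       sub { my $dat = shift; $dat->load; $dat->title },
#       $bbs->subject->threads,
#   );
```

Keeping the delay in one helper means the loop body from the SYNOPSIS does not need to change; only the iteration is wrapped.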
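METHODS
Constructor options
- url
  Permalink of the board's top page. The SYNOPSIS also passes a thread URL.
- cache
  Cache directory, or a Cache module object.
- plugin
  Plugin name (default: Base). The SYNOPSIS also passes the path of a plugin file.
Object methods
- encoding
  Encoding name used by the plugin.
- load_setting
  Reads the board settings.
- load_subject
  Reads the article (thread) list.
- parse_dat($data[, $subject])
  Parses $data as a dat.
- recall_dat($key)
  Recalls a dat from the cache file.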
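To make the input of parse_dat less abstract, the following sketch splits one raw dat line by hand. It is not how WWW::2ch parses data internally. split_dat_line is a hypothetical helper, and the field layout name<>mail<>date<>body<>title is an assumption about the conventional 2ch dat format that this document does not spell out.

```perl
use strict;
use warnings;

# Hypothetical helper (not part of WWW::2ch): split one raw dat line.
# Assumed layout: name<>mail<>date<>body<>title, where the title is
# normally filled in only on the first line of a thread.
sub split_dat_line {
    my ($line) = @_;
    $line =~ s/\r?\n\z//;    # drop the trailing line ending
    # Limit of 5 keeps trailing empty fields and any "<>" inside the title.
    my ($name, $mail, $date, $body, $title) = split /<>/, $line, 5;
    return {
        name  => $name,
        mail  => $mail,
        date  => $date,
        body  => $body,
        title => defined $title ? $title : '',
    };
}
```

For real work, parse_dat and the WWW::2ch::Res accessors shown in the SYNOPSIS (resid, date, body, body_text) are the intended interface.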
SEE ALSO
http://2ch.net/, http://www.monazilla.org/, WWW::2ch::Subject, WWW::2ch::Dat, WWW::2ch::Res
AUTHOR
Kazuhiro Osawa <ko@yappo.ne.jp>
COPYRIGHT AND LICENSE
Copyright (C) 2006 by Kazuhiro Osawa
This library is free software; you can redistribute it and/or modify it under the same terms as Perl itself, either Perl version 5.8.5 or, at your option, any later version of Perl 5 you may have available.