WWW::Scraper::Yahoo360 - Yahoo 360 blogs old-fashioned crappy scraper
    use WWW::Scraper::Yahoo360;

    my $y360 = WWW::Scraper::Yahoo360->new({
        username => 'myusername',
        password => 'mypassword',
    });

    # Debug what's happening?
    $WWW::Scraper::Yahoo360::DEBUG = 1;

    # First you have to login
    $y360->login() or die "Login failed?";

    # High level blog information
    my $blog_info = $y360->blog_info();

    # Gets all the blog posts
    my $posts = $y360->get_blog_posts();

    # Gets all the blog post comments
    my $comments = $y360->get_blog_comments();
Ignorant web scraper, based on WWW::Mechanize, that connects to your Yahoo 360 account and tries to fetch the blog posts and comments you still have on their service.
If it breaks, well... it's a scraper.
This module is used on the My Opera Community, http://my.opera.com, to import existing Yahoo 360 blogs into the My Opera blog service.
new(\%args)
Where \%args is a hashref containing the username and password of your Yahoo 360 account.
This creates a new WWW::Scraper::Yahoo360 object, ready to scrape.
blog_info([$blog_page])
Fetches high-level blog information for your Yahoo 360 blog. If a $blog_page argument is supplied, the blog information is looked up inside the contents of that scalar. Otherwise it's fetched from the network. $blog_page must contain a full HTML page string.
Returns a hashref with some or all of the following information:
link
Something like: http://blog.360.yahoo.com/blog-<yourusername>
sharing
Most probably public. Could also be friends or friends of friends, but those values were never tested.
count
Number of blog posts in total.
start
First blog post on the frontpage. Should be 1.
end
Last blog post on the frontpage, usually 5.
title
Title of the blog.
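As a rough illustration of how the returned hashref might be consumed, the sketch below builds a one-line summary from the keys listed above. The helper name summarize_blog_info and the sample values are made up for this example; in real use the hashref would come from $y360->blog_info(). Since only some of the keys may be present, each one gets a fallback.

```perl
use strict;
use warnings;

# Hypothetical helper (not part of the module): builds a short summary
# from a blog_info()-style hashref. Missing keys fall back to defaults.
sub summarize_blog_info {
    my ($info) = @_;
    my $title   = defined $info->{title}   ? $info->{title}   : '(untitled)';
    my $sharing = defined $info->{sharing} ? $info->{sharing} : 'unknown';
    my $count   = defined $info->{count}   ? $info->{count}   : 0;
    return sprintf '%s: %d posts, sharing=%s', $title, $count, $sharing;
}

# Sample data standing in for the result of $y360->blog_info()
my $info = {
    link    => 'http://blog.360.yahoo.com/blog-<yourusername>',
    sharing => 'public',
    count   => 23,
    start   => 1,
    end     => 5,
    title   => 'My travel blog',
};

print summarize_blog_info($info), "\n";
```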
blog_main_page()
Fetches the user's main blog page. Returns a string with the HTML page contents. This can be used in blog_info() or get_blog_posts().
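A minimal sketch of that reuse, assuming $y360 is an already logged-in WWW::Scraper::Yahoo360 object: the page is fetched once and the same HTML string is handed to both parsing methods. The helper name fetch_front_page_data is hypothetical.

```perl
use strict;
use warnings;

# Hypothetical helper (not part of the module). $y360 is assumed to be
# a logged-in WWW::Scraper::Yahoo360 object.
sub fetch_front_page_data {
    my ($y360) = @_;

    # Fetch the main blog page once...
    my $page = $y360->blog_main_page();

    # ...then let both methods parse the same HTML string.
    my $info  = $y360->blog_info($page);
    my $posts = $y360->get_blog_posts($page);

    return ($info, $posts);
}
```

Called as my ($info, $posts) = fetch_front_page_data($y360); this should only cover the posts found on the main page, since get_blog_posts() restricts itself to the page it is given.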
blog_page_url($link, $start, $per_page, $count)
Builds the URL to fetch a specific blog page.
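The URL layout itself is not documented here, so the following sketch does not build URLs. It only illustrates how $start, $per_page and $count plausibly relate when walking a blog page by page, assuming posts are numbered from 1 as the start value of blog_info() suggests. The helper name page_windows is hypothetical.

```perl
use strict;
use warnings;

# Hypothetical helper (not part of the module): lists the [start, end]
# post ranges, one per page, needed to cover $count posts when each
# page holds $per_page posts.
sub page_windows {
    my ($per_page, $count) = @_;
    return () if $per_page < 1;    # avoid looping forever on bad input

    my @windows;
    for (my $start = 1; $start <= $count; $start += $per_page) {
        my $end = $start + $per_page - 1;
        $end = $count if $end > $count;    # last page may be shorter
        push @windows, [ $start, $end ];
    }
    return @windows;
}

# 12 posts at 5 per page
print join(', ', map { "$_->[0]-$_->[1]" } page_windows(5, 12)), "\n";
# prints "1-5, 6-10, 11-12"
```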
dump()
Dumps the content of the last accessed page to STDOUT.
login()
Logs in to the Yahoo service. Returns a scalar that is true if the login was successful and false otherwise.
get_blog_comments(\@posts)
Retrieves all comments in the user's blog. Wants the structure returned by get_blog_posts().
get_blogpost_comments($post)
Retrieves all comments to a single blog post. Wants a single $post entry (hashref): one of the elements returned by get_blog_posts().
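One possible use, sketched under a few assumptions: $y360 is a logged-in WWW::Scraper::Yahoo360 object, $posts is an arrayref of post hashrefs, and each post carries its permanent URL under a link key. The helper name comments_for_commented_posts is hypothetical, and the shape of what get_blogpost_comments() returns is left opaque.

```perl
use strict;
use warnings;

# Hypothetical helper (not part of the module). Fetches comments only
# for posts that report at least one comment, keyed by the post's URL.
sub comments_for_commented_posts {
    my ($y360, $posts) = @_;

    my %comments_by_link;
    for my $post (@{$posts}) {
        # Skip posts with a zero or missing comment count
        next unless $post->{comments};
        $comments_by_link{ $post->{link} } = $y360->get_blogpost_comments($post);
    }
    return \%comments_by_link;
}
```

Skipping posts with no comments keeps the number of page requests down, which matters for a scraper that can break at any time.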
get_blog_posts([$blog_page, [%overrides]])
Gets all blog posts by a user. If $blog_page is supplied, it looks for blog posts in that page only.
%overrides is an optional list of key/value pairs that override some of the properties of the blog being scraped and parsed. For the list of properties, see blog_info().
Returns an array of hashrefs, each one representing a blog post. Each post (hashref) should have the following keys:
Example:
    my $y360 = WWW::Scraper::Yahoo360->new({
        username => '...',
        password => '...',
    });
    $y360->login() or die "Failed login";

    # Fetch only the first blog post, no matter what
    my $first_page = $y360->blog_main_page();
    my $blog_posts = $y360->get_blog_posts($first_page, count => 1);
comments
Number of comments to this blog post
description
Blog post content
link
Permanent URL of the blog post
pubDate
Date when the blog post was published, in HTTP::Date format, e.g. Sun, Nov 14 06:20:28 CET.
tags
Comma-delimited string of tags (e.g. travel, holiday)
title
Title of the blog post
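Because tags arrive as one comma-delimited string, callers will usually want to split them. The sketch below shows one way to do that and to pick out posts carrying a given tag. The helper names post_tags and posts_with_tag are hypothetical, and the sample posts stand in for real get_blog_posts() output.

```perl
use strict;
use warnings;

# Hypothetical helper: splits the comma-delimited tags string of a post
# into a list, trimming whitespace and dropping empty entries.
sub post_tags {
    my ($post) = @_;
    my $tags = defined $post->{tags} ? $post->{tags} : '';

    my @tags;
    for my $tag (split /,/, $tags) {
        $tag =~ s/^\s+//;
        $tag =~ s/\s+$//;
        push @tags, $tag if length $tag;
    }
    return @tags;
}

# Hypothetical helper: returns the posts (hashrefs) tagged with $wanted.
sub posts_with_tag {
    my ($posts, $wanted) = @_;
    return grep {
        my $post = $_;
        grep { $_ eq $wanted } post_tags($post);
    } @{$posts};
}

# Sample data standing in for the result of get_blog_posts()
my $posts = [
    { tags => 'travel, holiday' },
    { tags => 'perl' },
    { tags => '' },
];

my @holiday_posts = posts_with_tag($posts, 'holiday');
print scalar(@holiday_posts), " post(s) tagged 'holiday'\n";
```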
mech()
WWW::Mechanize object accessor.
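The accessor makes it possible to adjust the underlying user agent before scraping. A minimal sketch, assuming mech() returns a regular WWW::Mechanize object (which inherits timeout() from LWP::UserAgent); the helper name set_scraper_timeout and the 30-second default are arbitrary choices for the example.

```perl
use strict;
use warnings;

# Hypothetical helper (not part of the module): sets the HTTP request
# timeout, in seconds, on the scraper's underlying user agent.
sub set_scraper_timeout {
    my ($y360, $seconds) = @_;
    $seconds = 30 unless defined $seconds;

    my $mech = $y360->mech();
    $mech->timeout($seconds);    # method inherited from LWP::UserAgent
    return $mech;
}
```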
parse_date($date_string)
Tries to parse a date from the Yahoo 360 format to a unix timestamp.
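The Yahoo 360 input format is not described here, so parsing it is not attempted below. The sketch covers only the opposite direction: turning a unix timestamp, such as the one parse_date() yields, into the standard HTTP date form in GMT, which is also what time2str() from HTTP::Date produces. The helper name epoch_to_http_date is hypothetical, and the day and month names are hard-coded to keep the output independent of the locale.

```perl
use strict;
use warnings;

my @DAYS   = qw(Sun Mon Tue Wed Thu Fri Sat);
my @MONTHS = qw(Jan Feb Mar Apr May Jun Jul Aug Sep Oct Nov Dec);

# Hypothetical helper (not part of the module): formats a unix
# timestamp as an HTTP-style date string in GMT.
sub epoch_to_http_date {
    my ($epoch) = @_;
    my ($sec, $min, $hour, $mday, $mon, $year, $wday) = gmtime($epoch);
    return sprintf '%s, %02d %s %04d %02d:%02d:%02d GMT',
        $DAYS[$wday], $mday, $MONTHS[$mon], $year + 1900, $hour, $min, $sec;
}

print epoch_to_http_date(784111777), "\n";
# prints "Sun, 06 Nov 1994 08:49:37 GMT"
```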
Nothing is exported by default.
Cosimo Streppone, <cosimo@cpan.org>
Copyright (C) 2009 by Cosimo Streppone, cosimo@cpan.org
This library is free software; you can redistribute it and/or modify it under the same terms as Perl itself, either Perl version 5.8.8 or, at your option, any later version of Perl 5 you may have available.