BusyBird::Input::Feed - input BusyBird statuses from RSS/Atom feed
    use BusyBird;
    use BusyBird::Input::Feed;

    my $input = BusyBird::Input::Feed->new;

    my $statuses = $input->parse($feed_xml);
    timeline("feed")->add($statuses);

    $statuses = $input->parse_file("feed.atom");
    timeline("feed")->add($statuses);

    $statuses = $input->parse_url('https://metacpan.org/feed/recent?f=');
    timeline("feed")->add($statuses);
BusyBird::Input::Feed converts RSS and Atom feeds into BusyBird status objects.
For convenience, an executable script busybird_input_feed is bundled in this distribution.
new(%args) is the constructor.
Fields in %args are:
use_favicon
If true (or omitted or undef), the object tries to use the favicon of the website providing the feed as the icon of the statuses.
If it is defined and false, no favicon is used.
user_agent
LWP::UserAgent object for fetching documents.
image_max_num
The maximum number of image URLs extracted from the feed item.
If set to 0, it extracts no images. If set to a negative value, it extracts all image URLs from the feed item.
The extracted image URLs are stored as Twitter Entities in the status's extended_entities field, so that BusyBird will render them. See "extended_entities.media" in BusyBird::Manual::Status for detail.
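As an illustration of the fields above, here is a minimal sketch of a constructor call that sets all three. It assumes the constructor takes the fields as a plain key-value list, as "%args" suggests. The timeout, the agent string and the limit of three images are arbitrary values chosen for this example.

```perl
use strict;
use warnings;
use LWP::UserAgent;
use BusyBird::Input::Feed;

# A user agent with a short timeout. The timeout and the agent string
# are arbitrary choices for this sketch.
my $ua = LWP::UserAgent->new(timeout => 10, agent => 'my-feed-reader/0.1');

my $input = BusyBird::Input::Feed->new(
    use_favicon   => 0,      # defined and false: do not use favicons
    user_agent    => $ua,    # used for fetching documents
    image_max_num => 3,      # extract at most three image URLs per feed item
);
```

Passing a preconfigured LWP::UserAgent object is a convenient way to control timeouts or proxy settings for parse_url().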
parse_string() converts the given $feed_xml_string into BusyBird $statuses. The parse() method is an alias for parse_string().
$feed_xml_string is the XML data to be parsed. It must be a string encoded in UTF-8.
Return value $statuses is an array-ref of BusyBird status objects.
If $feed_xml_string is invalid, it croaks.
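Because parse_string() croaks on invalid input, callers may want to wrap it in eval. The following is a minimal sketch under a few assumptions: the RSS document is made up for this example, and use_favicon is set to 0 on the assumption that this keeps parsing free of network access. The "text" field is the same one printed in the example at the end of this document.

```perl
use strict;
use warnings;
use Encode qw(encode_utf8);
use BusyBird::Input::Feed;

# A tiny RSS 2.0 document, made up for this example.
# encode_utf8() turns the character string into UTF-8 encoded bytes.
my $feed_xml = encode_utf8(<<'XML');
<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0">
  <channel>
    <title>Example feed</title>
    <link>http://example.com/</link>
    <description>A hypothetical feed</description>
    <item>
      <title>First entry</title>
      <link>http://example.com/1</link>
    </item>
  </channel>
</rss>
XML

my $input = BusyBird::Input::Feed->new(use_favicon => 0);

# eval returns undef if parse_string() croaks; the message is left in $@.
my $statuses = eval { $input->parse_string($feed_xml) };
if (!defined $statuses) {
    die "Could not parse the feed: $@";
}

print scalar(@$statuses), " status(es) parsed\n";
print "$_->{text}\n" foreach @$statuses;
```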
Same as parse_string() except parse_file() reads the file named $feed_xml_filename and converts its content.
Same as parse_string() except parse_url() downloads the feed XML from $feed_xml_url and converts its content.
The parse_uri() method is an alias for parse_url().
The example below uses Parallel::ForkManager to run parse_url() calls of BusyBird::Input::Feed in parallel. This can greatly reduce the total time needed to download many RSS/Atom feeds.
    use strict;
    use warnings;
    use Parallel::ForkManager;
    use BusyBird::Input::Feed;
    use open qw(:std :encoding(utf8));

    my @feeds = (
        'https://metacpan.org/feed/recent?f=',
        'http://www.perl.com/pub/atom.xml',
        'https://github.com/perl-users-jp/perl-users.jp-htdocs/commits/master.atom',
    );
    my $MAX_PROCESSES = 10;
    my $pm = Parallel::ForkManager->new($MAX_PROCESSES);
    my $input = BusyBird::Input::Feed->new;

    my @statuses = ();

    $pm->run_on_finish(sub {
        my ($pid, $exitcode, $id, $signal, $coredump, $statuses) = @_;
        push @statuses, @$statuses;
    });

    foreach my $feed (@feeds) {
        $pm->start and next;
        warn "Start loading $feed\n";
        my $statuses = $input->parse_url($feed);
        warn "End loading $feed\n";
        $pm->finish(0, $statuses);
    }
    $pm->wait_all_children;

    foreach my $status (@statuses) {
        print "$status->{user}{screen_name}: $status->{text}\n";
    }
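If a child process fails, the run_on_finish callback receives undef instead of an array-ref, and dereferencing it aborts the parent. The following variant is a sketch of one way to guard against that, assuming parse_url() croaks when a download or parse fails. The @failed list and the use of the feed URL as the process identifier are choices made for this sketch, not part of the module.

```perl
use strict;
use warnings;
use Parallel::ForkManager;
use BusyBird::Input::Feed;

my @feeds = ('https://metacpan.org/feed/recent?f=');
my $pm    = Parallel::ForkManager->new(10);
my $input = BusyBird::Input::Feed->new;
my (@statuses, @failed);

$pm->run_on_finish(sub {
    my ($pid, $exitcode, $id, $signal, $coredump, $statuses) = @_;
    # $statuses is undef when the child exited without passing data back.
    if ($exitcode == 0 && defined $statuses) {
        push @statuses, @$statuses;
    } else {
        push @failed, $id;
    }
});

foreach my $feed (@feeds) {
    # The feed URL serves as the process identifier ($id in the callback).
    $pm->start($feed) and next;
    # Child process: trap errors so that a failure becomes a non-zero exit code.
    my $statuses = eval { $input->parse_url($feed) };
    $pm->finish(defined($statuses) ? (0, $statuses) : (1));
}
$pm->wait_all_children;

warn "Failed to load $_\n" foreach @failed;
print scalar(@statuses), " status(es) loaded\n";
```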
BusyBird
BusyBird::Manual::Status
https://github.com/debug-ito/BusyBird-Input-Feed
Please report bugs and feature requests to my GitHub issues https://github.com/debug-ito/BusyBird-Input-Feed/issues.
Although I prefer GitHub, non-GitHub users can use CPAN RT https://rt.cpan.org/Public/Dist/Display.html?Name=BusyBird-Input-Feed. If you do not have a CPAN RT account, please report bugs by email to bug-BusyBird-Input-Feed at rt.cpan.org.
Toshio Ito, <toshioito at cpan.org>
Copyright 2014 Toshio Ito.
This program is free software; you can redistribute it and/or modify it under the terms of either: the GNU General Public License as published by the Free Software Foundation; or the Artistic License.
See http://dev.perl.org/licenses/ for more information.