NAME

BusyBird::Input::Feed - input BusyBird statuses from RSS/Atom feed

SYNOPSIS

    use BusyBird;
    use BusyBird::Input::Feed;
    
    my $input = BusyBird::Input::Feed->new;
    
    my $statuses = $input->parse($feed_xml);
    timeline("feed")->add($statuses);
    
    $statuses = $input->parse_file("feed.atom");
    timeline("feed")->add($statuses);
    
    $statuses = $input->parse_url('https://metacpan.org/feed/recent?f=');
    timeline("feed")->add($statuses);

DESCRIPTION

BusyBird::Input::Feed converts RSS and Atom feeds into BusyBird status objects.

For convenience, an executable script busybird_input_feed is bundled in this distribution.

CLASS METHODS

$input = BusyBird::Input::Feed->new(%args)

The constructor.

Fields in %args are:

use_favicon => BOOL (optional, default: true)

If true (or omitted or undef), the module tries to use the favicon of the website providing the feed as the icon of each status.

If defined and false, favicons are not used.

user_agent => LWP::UserAgent object (optional)

An LWP::UserAgent object used for fetching documents.

image_max_num => INT (optional, default: 3)

The maximum number of image URLs extracted from the feed item.

If set to 0, it extracts no images. If set to a negative value, it extracts all image URLs from the feed item.

The extracted image URLs are stored as Twitter Entities in the status's extended_entities field, so that BusyBird will render them. See "extended_entities.media" in BusyBird::Manual::Status for details.
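As a minimal sketch of how the constructor fields above fit together, the snippet below passes all three options explicitly. The timeout and agent string given to LWP::UserAgent are illustrative choices, not values required by this module.

```perl
use strict;
use warnings;
use LWP::UserAgent;
use BusyBird::Input::Feed;

# Illustrative user agent settings (assumed values, adjust to taste).
my $ua = LWP::UserAgent->new(
    timeout => 10,
    agent   => 'my-feed-reader/0.1',
);

my $input = BusyBird::Input::Feed->new(
    use_favicon   => 0,     # defined and false: do not use favicons
    user_agent    => $ua,   # used for fetching documents
    image_max_num => -1,    # negative: extract all image URLs from each item
);
```

Omitting every field is equivalent to accepting the documented defaults (favicons enabled, at most 3 images per feed item).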

OBJECT METHODS

$statuses = $input->parse($feed_xml_string)

$statuses = $input->parse_string($feed_xml_string)

Converts the given $feed_xml_string into BusyBird $statuses. The parse() method is an alias for parse_string().

$feed_xml_string is the XML data to be parsed. It must be a string encoded in UTF-8.

Return value $statuses is an array-ref of BusyBird status objects.

If $feed_xml_string is invalid, it croaks.
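Because parse_string() croaks on invalid input, callers that handle untrusted feeds may want to wrap the call in eval. The sketch below assumes a small hand-written Atom document and a hypothetical helper named try_parse(); neither is part of this distribution. The use_favicon option is disabled so that the example should not need to fetch a favicon.

```perl
use strict;
use warnings;
use BusyBird::Input::Feed;

my $input = BusyBird::Input::Feed->new(use_favicon => 0);

# A small ASCII-only Atom feed, written by hand for this example.
my $feed_xml = <<'XML';
<?xml version="1.0" encoding="utf-8"?>
<feed xmlns="http://www.w3.org/2005/Atom">
  <title>Example Feed</title>
  <id>urn:example:feed</id>
  <updated>2014-01-01T00:00:00Z</updated>
  <link href="http://example.com/"/>
  <entry>
    <title>First entry</title>
    <id>urn:example:entry-1</id>
    <updated>2014-01-01T00:00:00Z</updated>
    <link href="http://example.com/1"/>
  </entry>
</feed>
XML

# Hypothetical helper: returns an array-ref of statuses,
# or an empty array-ref if parse_string() croaks.
sub try_parse {
    my ($xml) = @_;
    my $statuses = eval { $input->parse_string($xml) };
    if (!defined $statuses) {
        warn "Could not parse feed: $@";
        return [];
    }
    return $statuses;
}

my $good = try_parse($feed_xml);
my $bad  = try_parse("this is not a feed");

print scalar(@$good), " status(es) from the Atom document\n";
print scalar(@$bad),  " status(es) from the invalid input\n";
```

Returning an empty array-ref on failure keeps the caller's code simple, since the result can always be passed to timeline(...)->add(). Whether to swallow the error or rethrow it is a design choice for the caller.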

$statuses = $input->parse_file($feed_xml_filename)

Same as parse_string() except parse_file() reads the file named $feed_xml_filename and converts its content.
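A short sketch of parse_file(), assuming the feed is first written to a temporary file with File::Temp. The feed content is a hand-written placeholder, and use_favicon is disabled so that the example should not need to fetch a favicon.

```perl
use strict;
use warnings;
use File::Temp qw(tempfile);
use BusyBird::Input::Feed;

# Placeholder Atom document (ASCII only) for illustration.
my $feed_xml = <<'XML';
<?xml version="1.0" encoding="utf-8"?>
<feed xmlns="http://www.w3.org/2005/Atom">
  <title>Example Feed</title>
  <id>urn:example:feed</id>
  <updated>2014-01-01T00:00:00Z</updated>
  <link href="http://example.com/"/>
  <entry>
    <title>First entry</title>
    <id>urn:example:entry-1</id>
    <updated>2014-01-01T00:00:00Z</updated>
    <link href="http://example.com/1"/>
  </entry>
</feed>
XML

# Write the feed to a temporary file that is removed at program exit.
my ($fh, $filename) = tempfile(SUFFIX => '.atom', UNLINK => 1);
print {$fh} $feed_xml;
close $fh or die "Cannot close $filename: $!";

my $input    = BusyBird::Input::Feed->new(use_favicon => 0);
my $statuses = $input->parse_file($filename);

# Each status is a hash-ref; the text field is printed here.
print "$_->{text}\n" for @$statuses;
```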

$statuses = $input->parse_url($feed_xml_url)

$statuses = $input->parse_uri($feed_xml_url)

Same as parse_string() except parse_url() downloads the feed XML from $feed_xml_url and converts its content.

parse_uri() method is an alias for parse_url().

EXAMPLE

The example below uses Parallel::ForkManager to run parse_url() calls in parallel, one child process per feed. Because most of the time is spent waiting on the network, this can greatly reduce the total time needed to download many RSS/Atom feeds.

    use strict;
    use warnings;
    use Parallel::ForkManager;
    use BusyBird::Input::Feed;
    use open qw(:std :encoding(utf8));
    
    my @feeds = (
        'https://metacpan.org/feed/recent?f=',
        'http://www.perl.com/pub/atom.xml',
        'https://github.com/perl-users-jp/perl-users.jp-htdocs/commits/master.atom',
    );
    my $MAX_PROCESSES = 10;
    my $pm = Parallel::ForkManager->new($MAX_PROCESSES);
    my $input = BusyBird::Input::Feed->new;
    
    my @statuses = ();
    
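    $pm->run_on_finish(sub {
        my ($pid, $exitcode, $id, $signal, $coredump, $statuses) = @_;
        # $statuses is undef if the child exited without passing data
        # (for example, because parse_url() croaked).
        push @statuses, @$statuses if $statuses;
    });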
    
    foreach my $feed (@feeds) {
        $pm->start and next;
        warn "Start loading $feed\n";
        my $statuses = $input->parse_url($feed);
        warn "End loading $feed\n";
        $pm->finish(0, $statuses);
    }
    $pm->wait_all_children;
    
    foreach my $status (@statuses) {
        print "$status->{user}{screen_name}: $status->{text}\n";
    }

SEE ALSO

REPOSITORY

https://github.com/debug-ito/BusyBird-Input-Feed

BUGS AND FEATURE REQUESTS

Please report bugs and feature requests to the GitHub issue tracker at https://github.com/debug-ito/BusyBird-Input-Feed/issues.

Although I prefer GitHub, non-GitHub users can use CPAN RT https://rt.cpan.org/Public/Dist/Display.html?Name=BusyBird-Input-Feed. If you do not have a CPAN RT account, please send email to bug-BusyBird-Input-Feed at rt.cpan.org to report bugs.

AUTHOR

Toshio Ito, <toshioito at cpan.org>

LICENSE AND COPYRIGHT

Copyright 2014 Toshio Ito.

This program is free software; you can redistribute it and/or modify it under the terms of either: the GNU General Public License as published by the Free Software Foundation; or the Artistic License.

See http://dev.perl.org/licenses/ for more information.