The Perl Toolchain Summit needs more sponsors. If your company depends on Perl, please support this very important event.

NAME

App::Rssfilter::Match::Duplicates - match an RSS item which has been seen before

VERSION

version 0.07

SYNOPSIS

    use App::Rssfilter::Match::Duplicates;

    use Mojo::DOM;
    my $first_rss = Mojo::DOM->new( <<"End_of_RSS" );
<?xml version="1.0" encoding="UTF-8"?>
<rss>
  <channel>
    <item>
      <link>http://rss.slashdot.org/~r/Slashdot/slashdot/~6/gu7UEWn8onK/is-typing-tiring-your-toes</link>
      <description>type with toes for tighter tarsals</description>
    </item>
    <item>
      <link>http://rss.slashdot.org/~r/Slashdot/slashdot/~9/lloek9InU2p/new-planet-discovered-on-far-side-of-sun</link>
      <description>vulcan is here</description>
    </item>
  </channel>
</rss>
End_of_RSS

    my $second_rss = Mojo::DOM->new( <<"End_of_RSS" );
<?xml version="1.0" encoding="UTF-8"?>
<rss>
  <channel>
    <item>
      <link>http://rss.slashdot.org/~r/Slashdot/slashdot/~3/mnej39gJa9E/new-rocket-to-visit-mars-in-60-days</link>
      <description>setting a new speed record</description>
    </item>
    <item>
      <link>http://rss.slashdot.org/~r/Slashdot/slashdot/~9/lloek9InU2p/new-planet-discovered-on-far-side-of-sun</link>
      <description>vulcan is here</description>
    </item>
  </channel>
</rss>
End_of_RSS

    print "$_\n" for $first_rss->find( 'item' )->grep( \&App::Rssfilter::Match::Duplicates::match );
    print "$_\n" for $second_rss->find( 'item' )->grep( \&App::Rssfilter::Match::Duplicates::match );

    # or with an App::Rssfilter::Rule

    use App::Rssfilter::Rule;
    my $dupe_rule = App::Rssfilter::Rule->new(
        condition => 'Duplicates',
        action    => sub { print shift->to_xml, "\n" },
    );
    $dupe_rule->constrain( $first_rss );
    $dupe_rule->constrain( $second_rss );

    # either way, prints

    # <item>
    #   <link>http://rss.slashdot.org/~r/Slashdot/slashdot/~9/lloek9InU2p/new-planet-discovered-on-far-side-of-sun</link>
    #   <description>vulcan is here</description>
    # </item>

DESCRIPTION

This module will match RSS items if either the GUID or link of the item have been seen previously.

FUNCTIONS

match

    my $item_seen_before = App::Rssfilter::Match::Duplicate::match( $item );

Returns true if $item has a GUID or link which matches a previously-seen GUID or link. Query strings in links and GUIDs will be ignored for the purposes of matching a previous link.

SEE ALSO

AUTHOR

Daniel Holz <dgholz@gmail.com>

COPYRIGHT AND LICENSE

This software is copyright (c) 2013 by Daniel Holz.

This is free software; you can redistribute it and/or modify it under the same terms as the Perl 5 programming language system itself.