NAME

WWW::Mechanize::Chrome::URLBlacklist - blacklist URLs from fetching

SYNOPSIS

    use WWW::Mechanize::Chrome;
    use WWW::Mechanize::Chrome::URLBlacklist;

    my $mech = WWW::Mechanize::Chrome->new();
    my $bl = WWW::Mechanize::Chrome::URLBlacklist->new(
        blacklist => [
            qr!\bgoogleadservices\b!,
        ],
        whitelist => [
            qr!\bcorion\.net\b!,
        ],

        # fail all unknown URLs
        default => 'failRequest',
        # allow all unknown URLs
        # default => 'continueRequest',

        on_default => sub {
            warn "Ignored URL $_[0] (action was '$_[1]')";
        },
    );
    $bl->enable($mech);

DESCRIPTION

This module provides a simple way to whitelist and blacklist URLs so that Chrome does not make requests to blacklisted URLs.

ATTRIBUTES

<whitelist>

Arrayref containing regular expressions of URLs to always allow fetching.

<blacklist>

Arrayref containing regular expressions of URLs to always deny fetching unless they are matched by something in the whitelist.

<default>

  default => 'continueRequest'

The action to take if a URL matches neither the whitelist nor the blacklist. The default is continueRequest. To block all unknown URLs, use failRequest.
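The precedence described for whitelist, blacklist and default can be sketched in plain Perl. This is only an illustration of the documented rules, not the module's actual implementation; the helper name classify_url and the example URLs are assumptions made for this sketch.

```perl
use strict;
use warnings;

# Hypothetical helper, NOT part of WWW::Mechanize::Chrome::URLBlacklist.
# It mirrors the documented rules: whitelist first, then blacklist,
# then the default action.
sub classify_url {
    my( $url, %rules ) = @_;

    # A whitelisted URL is always allowed, even if the blacklist matches too
    for my $re (@{ $rules{whitelist} || [] }) {
        return 'continueRequest' if $url =~ $re;
    }

    # A blacklisted URL is denied
    for my $re (@{ $rules{blacklist} || [] }) {
        return 'failRequest' if $url =~ $re;
    }

    # Anything else gets the default action
    return $rules{default} || 'continueRequest';
}

my %rules = (
    whitelist => [ qr!\bcorion\.net\b! ],
    blacklist => [ qr!\bgoogleadservices\b!, qr!\.png$! ],
    default   => 'failRequest',
);

for my $url (
    'https://corion.net/logo.png',                            # whitelisted
    'https://www.googleadservices.com/pagead/conversion.js',  # blacklisted
    'https://example.com/',                                   # unknown
) {
    printf "%-58s %s\n", $url, classify_url( $url, %rules );
}
```

Note that the first URL ends in .png and would match the blacklist, but the whitelist entry takes priority.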

<on_default>

  on_default => sub {
      my( $url, $action ) = @_;
      warn "Unknown URL <$url>";
  };

This callback is invoked for every URL that matches neither the whitelist nor the blacklist. This is useful for seeing which URLs are still missing a category.
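One possible use of the callback is to tally uncategorized URLs per host instead of warning on each one. The sketch below assumes only the ( $url, $action ) argument list shown above; the %seen hash, the host-extraction regex and the simulated calls are illustrative and not part of the module.

```perl
use strict;
use warnings;

# Count uncategorized requests per host (illustrative only)
my %seen;
my $on_default = sub {
    my( $url, $action ) = @_;
    # Extract the host part; fall back to the whole URL if there is none
    my ($host) = $url =~ m!^\w+://([^/:]+)!;
    $seen{ defined $host ? $host : $url }++;
};

# Simulated invocations; in real use the blacklist object calls the callback
$on_default->( 'https://cdn.example.com/a.js',  'continueRequest' );
$on_default->( 'https://cdn.example.com/b.css', 'continueRequest' );
$on_default->( 'https://tracker.example.org/p', 'continueRequest' );

# Report the hosts, most frequent first
for my $host (sort { $seen{$b} <=> $seen{$a} || $a cmp $b } keys %seen) {
    print "$seen{$host}\t$host\n";
}
```

In real use the code reference would be passed as on_default => $on_default to the constructor, and the report printed after the page has finished loading.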

<_mech>

(internal) The WWW::Mechanize::Chrome instance we are connected to.

<_request_listener>

(internal) The request listener created by WWW::Mechanize::Chrome while listening for URL messages.

METHODS

->new

  my $bl = WWW::Mechanize::Chrome::URLBlacklist->new(
      blacklist => [
          qr!\bgoogleadservices\b!,
          qr!\bioam\.de\b!,
          qr!\burchin\.js$!,
          qr!.*\.(?:woff|ttf)$!,
          qr!.*\.css(\?\w+)?$!,
          qr!.*\.png$!,
          qr!.*\bfavicon\.ico$!,
      ],
  );

->enable

  $bl->enable( $mech );

Attaches the blacklist to a WWW::Mechanize::Chrome object.

->disable

  $bl->disable( $mech );

Removes the blacklist from a WWW::Mechanize::Chrome object.

REPOSITORY

The public repository of this module is https://github.com/Corion/www-mechanize-chrome.

SUPPORT

The public support forum of this module is https://perlmonks.org/.

TALKS

I've given a German talk at GPW 2017, see http://act.yapc.eu/gpw2017/talk/7027 and https://corion.net/talks for the slides.

At The Perl Conference 2017 in Amsterdam, I also presented a talk, see http://act.perlconference.org/tpc-2017-amsterdam/talk/7022. The slides for the English presentation at TPCiA 2017 are at https://corion.net/talks/WWW-Mechanize-Chrome/www-mechanize-chrome.en.html.

BUG TRACKER

Please report bugs in this module via the RT CPAN bug queue at https://rt.cpan.org/Public/Dist/Display.html?Name=WWW-Mechanize-Chrome or via mail to www-mechanize-Chrome-Bugs@rt.cpan.org.

AUTHOR

Max Maischein corion@cpan.org

COPYRIGHT (c)

Copyright 2010-2020 by Max Maischein corion@cpan.org.

LICENSE

This module is released under the same terms as Perl itself.