NAME

Robots::Validate - Validate that IP addresses are associated with known robots

VERSION

version v0.2.9

SYNOPSIS

  use Robots::Validate;

  my $rv = Robots::Validate->new;

  ...

  if ( $rv->validate( $ip, \%opts ) ) { ...  }

DESCRIPTION

ATTRIBUTES

resolver

This is the Net::DNS::Resolver used for DNS lookups.

robots

This is an array reference of rules with information about robots. Each item is a hash reference with the following keys:

name

The name of the robot.

agent

A regular expression for matching against user agent names.

domain

A regular expression for matching against the hostname.
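
To illustrate the rule structure described above, here is a minimal sketch of a single rule and how its two patterns might be applied. The robot name "Examplebot", its domain, and the rule_matches helper are hypothetical and not part of the module.

```perl
use strict;
use warnings;

# A hypothetical rule with the three documented keys.
my $rule = {
    name   => 'Examplebot',
    agent  => qr/\bExamplebot\b/,
    domain => qr/\.crawl\.examplebot\.example$/,
};

# Hypothetical helper: returns 1 if both the user agent and the
# hostname match the rule's patterns, and 0 otherwise.
sub rule_matches {
    my ( $rule, $agent, $hostname ) = @_;
    return ( $agent =~ $rule->{agent} && $hostname =~ $rule->{domain} ) ? 1 : 0;
}
```

Anchoring the domain pattern with a leading dot and a trailing $ keeps a hostname such as crawl.examplebot.example.evil.test from matching. A list of such rules could presumably be supplied as the "robots" attribute, assuming it can be set in the constructor.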

die_on_error

When true, "validate" will die on a "resolver" failure.

By default it is false.

METHODS

validate

  my $result = $rv->validate( $ip, \%opts );

This method attempts to validate that an IP address belongs to a known robot by first looking up the hostname that corresponds to the IP address, and then validating that the hostname resolves to that IP address.

If this succeeds, it then checks if the hostname is associated with a known web robot.

If that succeeds, it returns a copy of the matched rule from "robots".
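
The steps above can be sketched as follows. This is not the module's implementation: confirm_robot is a hypothetical helper, and the DNS lookups are passed in as code references (also hypothetical) so that the logic can be read and exercised without network access.

```perl
use strict;
use warnings;

# Hypothetical helper, not the module's own code.
# $reverse->($ip) returns a list of hostnames for the address (PTR lookup).
# $forward->($hostname) returns a list of addresses for the hostname (A lookup).
sub confirm_robot {
    my ( $ip, $rules, $reverse, $forward ) = @_;

    for my $hostname ( $reverse->($ip) ) {

        # Step 1: the hostname must resolve back to the same address.
        next unless grep { $_ eq $ip } $forward->($hostname);

        # Step 2: the hostname must match a known robot's domain.
        for my $rule (@$rules) {
            return +{%$rule} if $hostname =~ $rule->{domain};
        }
    }

    return;    # no confirmed match
}
```

The forward lookup is what defeats spoofing: anyone who controls the reverse zone for an address can make it claim any hostname, but only the robot's operator controls what that hostname resolves to. In practice the two callbacks would wrap PTR and A queries made through the "resolver".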

You can specify the following %opts:

agent

This is the user-agent string. If it does not match, then the DNS lookups will not be performed.

It is optional.

Alternatively, you can pass in a Plack environment:

  my $result = $rv->validate($env);
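
As a rough sketch of how the environment form might be used, the hypothetical build_app helper below wraps any object that has a validate method, such as a Robots::Validate instance, in a minimal PSGI application. Which environment keys are read is not stated here; the sketch assumes the standard REMOTE_ADDR and HTTP_USER_AGENT keys are present, and that a failed validation yields a false value.

```perl
use strict;
use warnings;

# Hypothetical helper: builds a PSGI application around a validator
# object that responds to validate($env).
sub build_app {
    my ($validator) = @_;

    return sub {
        my ($env) = @_;

        # On success this is a copy of the matched rule; a false
        # value is assumed otherwise.
        my $rule = $validator->validate($env);

        my $body = $rule
          ? "Verified robot: $rule->{name}\n"
          : "Not a verified robot\n";

        return [ 200, [ 'Content-Type' => 'text/plain' ], [$body] ];
    };
}
```

With the module installed, the application could be created with build_app( Robots::Validate->new ) and returned from a PSGI file.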

KNOWN ISSUES

Undocumented Rules

Many of the rules in "robots" are not officially documented by the robots' operators, and have instead been guessed from web traffic.

Limitations

The current module can only be used for systems that consistently support reverse DNS lookups. This means that it cannot be used to validate some robots from Facebook or Twitter.

SUPPORT FOR OLDER PERL VERSIONS

This module requires Perl v5.14 or later.

Future releases may only support Perl versions released in the last ten years.

SEE ALSO

Verifying Bingbot
Verifying Googlebot
How to check that a robot belongs to Yandex

SOURCE

The development version is on GitHub at https://github.com/robrwo/Robots-Validate and may be cloned from git://github.com/robrwo/Robots-Validate.git

BUGS

Please report any bugs or feature requests on the bugtracker website https://github.com/robrwo/Robots-Validate/issues

When submitting a bug or request, please include a test-file or a patch to an existing test-file that illustrates the bug or desired feature.

AUTHOR

Robert Rothenberg <rrwo@cpan.org>

COPYRIGHT AND LICENSE

This software is Copyright (c) 2018-2024 by Robert Rothenberg.

This is free software, licensed under:

  The Artistic License 2.0 (GPL Compatible)