WWW::RobotRules - database of robots.txt-derived permissions
This module parses /robots.txt files as specified in "A Standard for Robot Exclusion" at <http://www.robotstxt.org/wc/norobots.html>. Webmasters can use the /robots.txt file to forbid conforming robots from accessing parts of their web site. The parsed files are kept in a WWW::RobotRules object, and this object provides methods to check if access to a given URL is prohibited. (GAAS/WWW-RobotRules-6.02 - 18 Feb 2012 13:09:13 UTC)
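A minimal sketch of the usual workflow: fetch a site's robots.txt yourself, feed it to a WWW::RobotRules object, and query allowed() before each request. The bot name, host, and URLs below are placeholders.

    use WWW::RobotRules;
    use LWP::Simple qw(get);

    # Name the rules object after your robot's User-Agent string.
    my $rules = WWW::RobotRules->new('MyBot/1.0');

    # Fetch and parse the site's robots.txt (URL and host are examples).
    my $robots_url = 'http://www.example.com/robots.txt';
    my $robots_txt = get($robots_url);
    $rules->parse($robots_url, $robots_txt) if defined $robots_txt;

    # Ask before visiting any URL on that host.
    print "OK to fetch\n"
        if $rules->allowed('http://www.example.com/some/page.html');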
WWW::RobotRules::DBIC - Persistent RobotRules which use DBIC.
WWW::RobotRules::DBIC is a subclass of WWW::RobotRules that uses DBIx::Class to store robots.txt information in any RDBMS. (IKEBE/WWW-RobotRules-DBIC-0.01 - 18 Oct 2006 13:58:41 UTC)
WWW::RobotRules::Extended - database of robots.txt-derived permissions. This is a fork of WWW::RobotRules
This module parses /robots.txt files as specified in "A Standard for Robot Exclusion" at <http://www.robotstxt.org/wc/norobots.html>. It also parses rules that contain wildcards ('*') and Allow directives, as Google does. Webmasters can use the /robots.txt file to forbid conforming robots from accessing parts of their web site. (YSIMONX/WWW-RobotRules-Extended-0.02 - 14 Jan 2012 10:23:47 UTC)
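Since it is a fork of WWW::RobotRules, it advertises the same new/parse/allowed interface; the sketch below assumes that interface and uses placeholder names, so it is essentially a drop-in swap of the class name rather than a guaranteed recipe.

    use WWW::RobotRules::Extended;
    use LWP::Simple qw(get);

    my $rules = WWW::RobotRules::Extended->new('MyBot/1.0');

    my $robots_url = 'http://www.example.com/robots.txt';
    my $robots_txt = get($robots_url);
    $rules->parse($robots_url, $robots_txt) if defined $robots_txt;

    # Wildcard and Allow rules (e.g. "Allow: /public/*") are honoured here.
    print "OK to fetch\n"
        if $rules->allowed('http://www.example.com/public/index.html');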
WWW::RobotRules::Parser - Just Parse robots.txt
WWW::RobotRules::Parser allows you to simply parse robots.txt files as described in <http://www.robotstxt.org/wc/norobots.html>. Unlike WWW::RobotRules (which is very cool), this module does not take your user agent name into consideration when parsing. (DMAKI/WWW-RobotRules-Parser-0.04001 - 01 Dec 2007 13:33:54 UTC)
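A sketch of plain parsing with no user-agent filtering. The two-argument parse($uri, $text) call mirrors the WWW::RobotRules parse method; the exact shape of the returned structure (user-agent names mapped to lists of disallowed paths) is an assumption to verify against the module's documentation.

    use WWW::RobotRules::Parser;
    use Data::Dumper;

    my $parser = WWW::RobotRules::Parser->new;

    # Example robots.txt content; every User-agent section is kept, not just yours.
    my $robots_txt = "User-agent: *\nDisallow: /private/\nDisallow: /tmp/\n";

    my $rules = $parser->parse('http://www.example.com/robots.txt', $robots_txt);
    print Dumper($rules);   # assumed shape: { '*' => ['/private/', '/tmp/'] }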
WWW::RobotRules::Memcache - Use memcached in conjunction with WWW::RobotRules
This is a subclass of WWW::RobotRules that uses Cache::Memcached to implement persistent caching of robots.txt and host visit information. (SOCK/WWW-RobotRules-Memcache-0.1 - 08 Sep 2006 02:02:39 UTC)
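A sketch assuming the constructor follows the same pattern as WWW::RobotRules::AnyDBM_File, taking the robot name plus an extra argument pointing at the cache backend (here a memcached server address); that signature is an assumption, so check the module's own synopsis before relying on it.

    use WWW::RobotRules::Memcache;
    use LWP::RobotUA;

    # Assumed signature: robot name, then the memcached server used as the cache.
    my $rules = WWW::RobotRules::Memcache->new('MyBot/1.0', '127.0.0.1:11211');

    # Hand the shared rules object to a polite robot user agent.
    my $ua = LWP::RobotUA->new('MyBot/1.0', 'me@example.com', $rules);
    my $response = $ua->get('http://www.example.com/');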
WWW::RobotRules::AnyDBM_File - Persistent RobotRules
This is a subclass of WWW::RobotRules that uses the AnyDBM_File package to implement persistent disk caching of robots.txt and host visit information. The constructor (the new() method) takes an extra argument specifying the name of the DBM file to use. (GAAS/WWW-RobotRules-6.02 - 18 Feb 2012 13:09:13 UTC)
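A minimal sketch of that extra constructor argument and of handing the persistent rules to LWP::RobotUA; the bot name, contact address, and cache file name are placeholders.

    use WWW::RobotRules::AnyDBM_File;
    use LWP::RobotUA;

    # The second argument is the DBM file that keeps the rules between runs.
    my $rules = WWW::RobotRules::AnyDBM_File->new('MyBot/1.0', 'robots-cache');

    # A robot UA built on these rules respects robots.txt automatically.
    my $ua = LWP::RobotUA->new('MyBot/1.0', 'me@example.com', $rules);
    my $response = $ua->get('http://www.example.com/');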
WWW::RobotRules::Parser::MultiValue - Parse robots.txt
"WWW::RobotRules::Parser::MultiValue" is a parser for "robots.txt". Parsed rules for the specified user agent is stored as a Hash::MultiValue, where the key is a lower case rule name. "Request-rate" rule is handled specially. It is normalized to "Cra...TARAO/WWW-RobotRules-Parser-MultiValue-0.02 - 12 Mar 2015 05:46:38 UTC
WWW::Mixi - Perl extension for scraping the MIXI social networking service.
WWW::Mixi uses LWP::RobotUA to scrape mixi.jp. It provides a login method, get and put methods, and parsing methods for users who write a mixi spider. I think using WWW::Mixi is better than using LWP::UserAgent or LWP::Simple for accessing mixi. (TSUKAMOTO/WWW-Mixi-0.50 - 01 Aug 2007 06:02:56 UTC)