Changes for version 0.08013

  • RobotRules
    • Add expiration configuration parameters so that the TTL of each set of robot rules can be configured
    • Add get_pending_robots_txt() and push_pending_robots_txt(). Pending requests are now controlled in the Storage::* classes
    • Fix the calling API for Storage::Cache so that it also works with Cache::Memcached::Managed
    • Fix a problem where RobotRules::Cache (which is distributed) could not tell that a robots.txt request had already been dispatched

Documentation

An Extensible, High-Performance Web Crawler Framework

Modules

Yet Another High Performance Web Crawler Framework
Base Class For Various Gungho Objects
Component Base Class For Gungho
Base Class For WWW Authentication
Add Basic Auth To Gungho
Use Cache In Your App
Respect robots.txt
RobotRules Storage Base Class
Cache Storage For RobotRules
DB_File Storage For RobotRules
Base Class To Throttle Requests
Throttle By Number Of Requests
Data::Throttler Based Throttling
Base Class For Gungho Engine
Gungho Engine Using Danga::Socket
IO::Async Engine
POE Engine For Gungho
Gungho Exceptions
Base Class For Gungho Handlers
Write Out Fetched Contents To File
A Handler That Does Nothing
Inline Your Providers And Handlers
Log Base Class For Gungho
Log::Dispatch-Based Log For Gungho
Simple Gungho Log Class
Gungho Plugin Base Class
Keep Track Of Time To Finish Request
Base Class For Gungho Providers
Provide Requests From A Simple File
An In-Memory, Simple Provider
Specify requests in YAML format
A Gungho Request Object
HTTP specific utilities

Provides

in lib/Gungho/Engine/IO/Async.pm
in lib/Gungho/Inline.pm
in lib/Gungho/Inline.pm