NAME

IRC::Indexer::Trawl::Bot - Indexing trawler instance

SYNOPSIS

  ## Inside a POE session
  ## 'spawn' returns session ID:  
  my $trawl_sess_id = IRC::Indexer::Trawl::Bot->spawn(
    ## Server address and port:
    Server  => 'irc.cobaltirc.org',
    Port    => 6667,
    
    ## Nickname, defaults to irctrawl$rand:
    Nickname => 'mytrawler',
    
    ## Local address to bind, if needed:
    BindAddr => '1.2.3.4',
    
    ## IPv6 trawler:
    UseIPV6 => 1,
    
    ## Overall timeout for this server
    ## (The IRC component may time out sooner if the socket is bust)
    Timeout => 90,
    
    ## Interval between commands (LIST/LINKS/LUSERS):
    Interval => 5,
    
    ## Verbosity/debugging level:
    Verbose => 0,
    
    ## Optionally use postback interface:
    Postback => $_[SESSION]->postback('trawler_done', $some_tag),
  );

  ## Using postback:
  sub trawler_done {
    my ($kernel, $heap) = @_[KERNEL, HEAP];
    my $tag     = $_[ARG0]->[0];
    my $trawler = $_[ARG1]->[0];
    my $report = $trawler->report;
    . . .
  }

  ## Or without postbacks:
  
  ## Spawn a bunch of trawlers in a loop
  ## new() and run() both return a trawler object
  my $trawlers;
  for my $server (@servers) {
    $trawlers->{$server} = IRC::Indexer::Trawl::Bot->new(
      Server => $server,
    )->run();
  }
  
  ## Check on them later:
  SERVER: for my $server (keys %$trawlers) {
    my $trawl = $trawlers->{$server};
    
    next SERVER unless $trawl->done;
    
    next SERVER if $trawl->failed;
    
    my $report  = $trawl->report;
    my $netname = $report->network;
    my $hash    = $report->netinfo;
    . . . 
  }

DESCRIPTION

A single instance of an IRC::Indexer trawler; this is the bot that forms the backbone of the rest of the IRC::Indexer modules and utilities.

Connects to a specified server, gathers some network information, and disconnects when either all requests appear to be fulfilled or the specified timeout (defaults to 90 seconds) is reached. Uses POE::Component::IRC for an IRC transport.

There are two ways to interact with a running trawler: the object interface or a POE session postback.

When the trawler is finished, $trawl->done() will be boolean true; if there was an error, $trawl->failed() will also return a true value, namely a scalar string describing the error. See "new" and "run" if you'd like to use the object interface.
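As a rough sketch of the object interface, the helper below sorts a set of trawlers into pending, failed, and finished buckets. It relies only on the done() and failed() methods described here. The name classify_trawlers and the shape of the returned hash are assumptions made for illustration; they are not part of IRC::Indexer.

```perl
use strict;
use warnings;

## Hypothetical helper (not part of IRC::Indexer): sort trawler objects
## into pending / failed / finished buckets using only done() and failed().
sub classify_trawlers {
  my ($trawlers) = @_;   ## hash ref: server name => trawler object
  my %status = ( pending => [], failed => {}, finished => [] );

  for my $server (sort keys %$trawlers) {
    my $trawl = $trawlers->{$server};

    ## Check done() first; an unfinished trawler has nothing to report yet.
    unless ( $trawl->done ) {
      push @{ $status{pending} }, $server;
      next;
    }

    ## failed() returns the error string, which is itself a true value.
    if ( my $err = $trawl->failed ) {
      $status{failed}{$server} = $err;
      next;
    }

    ## Done and not failed: report() should be safe to use.
    push @{ $status{finished} }, $server;
  }

  return \%status;
}
```

A caller might run this from a periodic timer and call report() only on the servers listed under 'finished'.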

If a postback was specified at construction time, the event will be posted when a trawler has finished. $_[ARG1]->[0] will contain the trawler object; $_[ARG0] will be an array reference containing any arguments specified in your 'Postback =>' option after the event name. See "spawn" if you'd like to use the POE interface.

The report() method returns the IRC::Indexer::Report::Server object.

The dump() method returns a hash reference containing network information, or undef if the trawler is not done. This is the same hash returned by "netinfo" in IRC::Indexer::Report::Server; see that module for details.

The trawler attempts to be polite, spacing out requests for LINKS, LUSERS, and LIST; you can fine-tune the interval between commands by specifying a different interval at construction.

See IRC::Indexer::Trawl::Forking for an interface-compatible forked trawler instance.

METHODS

new

Construct, but do not "run", a trawler instance.

Use new() when you'd like to create pending trawler instances that will sit around until instructed to "run".

new() can be used to construct trawlers before any POE sessions are initialized (but you lose the ability to use postbacks).

See "SYNOPSIS" for constructor options.

spawn

Construct and immediately run a trawler from within a running POE::Session.

Returns a POE session ID that can be used to post "shutdown" events if needed.

See "SYNOPSIS" for constructor options.

run

Start the trawler session. Returns the trawler object, so you can chain methods thusly:

  my $trawler = IRC::Indexer::Trawl::Bot->new(%opts)->run();

You should only call run() if you're not using the spawn() interface.

spawn() will call run() for you.

trawler_for

Returns the server this trawler was constructed for.

ID

Returns the POE::Session ID of the trawler, if it is running.

Can be used to post a "shutdown", if needed:

  $poe_kernel->post( $trawler->ID, 'shutdown' );

Returns undef if the trawler was constructed via new() but was never run().

failed

If a trawler has encountered an error, failed() will return a true value: a string describing the problem.

It's safest to skip failed runs when processing output; if a report object does exist, the reported data is probably incomplete or broken.

done

Returns boolean true if the trawler instance has finished; it may still be "failed" and have an incomplete or nonexistent report.

report

Returns the IRC::Indexer::Report::Server object, from which server information can be retrieved.

Nonexistent until the trawler has been ->run().

dump

Returns the "report" hash if the trawler instance has finished, or undef if not. See IRC::Indexer::Report::Server.

Shutting down

The trawler instance will run its own cleanup when the run has completed, but sometimes you may need to shut it down early.

The safest way to shut down your trawler is to post a shutdown event:

  my $sess_id = $trawler->ID();
  if ($sess_id) {
    ## Or call(), if you really must ...
    $poe_kernel->post( $sess_id, 'shutdown' );
  }
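
If several trawlers are in flight, the same pattern can be wrapped in a small helper. The sketch below is an illustration only: shutdown_unfinished is a hypothetical name, not part of IRC::Indexer, and it assumes its first argument provides a POE::Kernel-style post() method (normally $poe_kernel).

```perl
use strict;
use warnings;

## Hypothetical helper (not part of IRC::Indexer): post 'shutdown' to every
## trawler that has a session ID and has not finished yet.
## $kernel is assumed to offer a POE::Kernel-style post() method.
sub shutdown_unfinished {
  my ($kernel, @trawlers) = @_;
  my $posted = 0;

  for my $trawl (@trawlers) {
    ## Finished trawlers run their own cleanup; leave them alone.
    next if $trawl->done;

    ## ID() is undef if the trawler was built via new() but never run().
    my $sess_id = $trawl->ID;
    next unless $sess_id;

    $kernel->post( $sess_id, 'shutdown' );
    $posted++;
  }

  ## Number of shutdown events posted.
  return $posted;
}
```

With a hash of trawlers like the one in the "SYNOPSIS", this could be called as shutdown_unfinished($poe_kernel, values %$trawlers).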

AUTHOR

Jon Portnoy <avenj@cobaltirc.org>

http://www.cobaltirc.org