NAME

Parse::IRCLog - parse internet relay chat logs

VERSION

version 1.104

SYNOPSIS

  use Parse::IRCLog;

  $result = Parse::IRCLog->parse("perl-2004-02-01.log");

  my %to_print = ( msg => 1, action => 1 );

  for ($result->events) {
    next unless $to_print{ $_->{type} };
    print "$_->{nick}: $_->{text}\n";
  }

DESCRIPTION

This module provides a simple framework to parse IRC logs in arbitrary formats.

A parser has a set of regular expressions for matching different events that occur in an IRC log, such as "msg" and "action" events. Each line in the log is matched against these rules and a result object, representing the event stream, is returned.

The rule set, described in greater detail below, can be customized by subclassing Parse::IRCLog. In this way, Parse::IRCLog can provide a generic interface for log analysis across many log formats, including custom formats.

Normally, the parse method is used to create a result set without storing a parser object, but a parser may be created and reused.

METHODS

new

This method constructs a new parser (with $class->construct) and initializes it (with $obj->init). Construction and initialization are kept separate so that a subclass can override initialization alone, for example to support future pipe dreams like guessing what ruleset to use.

construct

The parser constructor just returns a new, empty parser object. It should be a blessed hashref.

init

The initialization method configures the object, loading its ruleset.
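The split between construct and init can be illustrated with a minimal sketch. The package names (My::Parser, My::Parser::Custom) and the ruleset key are hypothetical stand-ins chosen for illustration; this is not the module's own source.

```perl
use strict;
use warnings;

# Hypothetical stand-in for a parser class; not Parse::IRCLog itself.
package My::Parser;

sub new {
  my ($class) = @_;
  my $self = $class->construct;  # build the empty, blessed hashref
  $self->init;                   # then configure it
  return $self;
}

sub construct {
  my ($class) = @_;
  return bless {}, $class;
}

sub init {
  my ($self) = @_;
  $self->{ruleset} = 'default';  # assumed key, for illustration only
  return $self;
}

# A subclass that changes only the initialization step.
package My::Parser::Custom;
our @ISA = ('My::Parser');

sub init {
  my ($self) = @_;
  $self->{ruleset} = 'custom';
  return $self;
}

package main;

my $default = My::Parser->new;
my $custom  = My::Parser::Custom->new;
print "$default->{ruleset} $custom->{ruleset}\n";
```

Because new calls both steps through the invocant, the subclass inherits new and construct unchanged and replaces only init.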

patterns

This method returns a reference to a hash of regular expressions, which are used to parse the logs. Only a few, so far, are required by the parser, although internally a few more are used to break down the task of parsing lines.

action matches an action; that is, the result of /ME in IRC. It should return the following matches:

 $1 - timestamp
 $2 - nick prefix
 $3 - nick
 $4 - the action

msg matches a message; that is, the result of /MSG (or "normal talking") in IRC. It should return the following matches:

 $1 - timestamp
 $2 - nick prefix
 $3 - nick
 $4 - channel
 $5 - the message
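As a rough illustration of the capture-group layout described above, here is a sketch of two patterns for irssi-style lines. The regexes are simplified assumptions (the nick and channel rules in particular are much looser than real IRC grammar) and are not the patterns shipped with the module.

```perl
use strict;
use warnings;

# Simplified, assumed building blocks; the module's real regexes differ.
my $timestamp = qr/\[?(\d\d:\d\d(?::\d\d)?)\]?/;
my $nick      = qr/([A-Za-z_][A-Za-z0-9_-]*)/;
my $chan      = qr/(#[^\s,:]+)/;

my %pattern = (
  # $1 timestamp, $2 nick prefix, $3 nick, $4 the action
  action => qr/^$timestamp\s+\*\s+([+%@])?$nick\s+(.+)$/,
  # $1 timestamp, $2 nick prefix, $3 nick, $4 channel, $5 the message text
  msg    => qr/^$timestamp\s+<\s*([+%@])?$nick(?::$chan)?\s*>\s+(.+)$/,
);

# In list context a match returns every capture group; groups that did
# not take part in the match come back as undef.
my @action = '12:01 * rjbs waves' =~ $pattern{action};
my @msg    = '12:02 <@rjbs> hello, world' =~ $pattern{msg};

printf "%s %s\n", $action[2], $action[3];
printf "%s: %s\n", $msg[2], $msg[4];
```

Optional pieces such as the nick prefix and the channel still occupy a capture position, so the numbering stays fixed whether or not they appear in a given line.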

Read the source for a better idea as to how these regexps break down. Oh, and for what it's worth, the default patterns are based on my boring, default irssi configuration. Expect more rulesets to be included in future distributions.

parse($file)

This method parses the file named and returns a Parse::IRCLog::Result object representing the results. The parse method can be called on a parser object or on the class. If called on the class, a parser will be instantiated for the method call and discarded when parse returns.
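One common way to let a method work on either the class or an object is to check whether the invocant is a reference. The sketch below assumes a hypothetical My::Parser class whose parse takes a list of lines rather than a file name, purely to stay self-contained; it shows the dispatch idea, not the module's implementation.

```perl
use strict;
use warnings;

# Hypothetical class; only the invocant handling is the point here.
package My::Parser;

sub new {
  my ($class) = @_;
  return bless { lines_seen => 0 }, $class;
}

sub parse {
  my ($invocant, @lines) = @_;
  # Reuse the object if one was given; otherwise build a throwaway parser.
  my $self = ref $invocant ? $invocant : $invocant->new;
  $self->{lines_seen} += @lines;
  return scalar @lines;
}

package main;

my $count  = My::Parser->parse('a', 'b');  # temporary parser, then discarded
my $parser = My::Parser->new;
$parser->parse('c') for 1 .. 3;            # one parser, reused three times
print "$count $parser->{lines_seen}\n";
```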

parse_line($line)

This method is used internally by parse to turn each line into an event. While it could someday be made slick, it's adequate for now. It attempts to match each line against the required patterns returned by the patterns method and, if successful, returns a hashref describing the event.

If no match can be found, an "unknown" event is returned.
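The match-then-fall-back behavior might look roughly like the following. The two rules, the standalone parse_line function, and the timestamp key are assumptions made for this sketch; the type, nick, and text keys follow the SYNOPSIS.

```perl
use strict;
use warnings;

# Assumed, simplified rules, tried in order; not the module's own patterns.
my @rules = (
  [ action => qr/^(\d\d:\d\d)\s+\*\s+(\S+)\s+(.+)$/ ],
  [ msg    => qr/^(\d\d:\d\d)\s+<\s*[+%@]?(\S+?)>\s+(.+)$/ ],
);

sub parse_line {
  my ($line) = @_;
  for my $rule (@rules) {
    my ($type, $re) = @$rule;
    # A failed match yields an empty list, which is false here.
    if (my ($time, $nick, $text) = $line =~ $re) {
      return { type => $type, timestamp => $time, nick => $nick, text => $text };
    }
  }
  # Nothing matched: report the line as an "unknown" event.
  return { type => 'unknown', text => $line };
}

my @events = map { parse_line($_) } (
  '12:01 <rjbs> hello',
  '12:02 * rjbs waves',
  '12:03 -!- rjbs has quit',
);
print join(' ', map { $_->{type} } @events), "\n";
```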

TODO

Write a few example subclasses for common log formats.

Add a few more default event types: join, part, nick. Others?

Possibly make the patterns sub a module, to allow subclassing to override only one or two patterns. For example, to use the default nick pattern but override the nick_container or action_leader. This sounds like a very good idea, actually, now that I write it down.

AUTHOR

Ricardo SIGNES <rjbs@cpan.org>

COPYRIGHT

Copyright 2004 by Ricardo Signes.

This program is free software; you can redistribute it and/or modify it under the same terms as Perl itself.