NAME
Apache::Log::Parser - Parser for Apache logs (common, combined, and any other custom style defined with LogFormat).
SYNOPSIS
    my $parser = Apache::Log::Parser->new( fast => 1 );
    my $log = $parser->parse($logline);
    $log->{rhost}; #=> remote host
    $log->{agent}; #=> user agent
DESCRIPTION
Apache::Log::Parser is a parser module for Apache logs. It accepts the 'common' and 'combined' formats as well as any other custom style. It works relatively fast, and it can handle backslash-escaped double quotes inside quoted fields properly (with the 'strict' parser).
Once a parser is instantiated, it can parse every format it was configured with through a single method, 'parse'.
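As a rough sketch of this single-method workflow, the helper below counts requests per remote host. The name count_by_rhost is hypothetical and not part of the module. It only assumes an object with a 'parse' method that returns a hash reference with an 'rhost' key, and it assumes that a line which cannot be parsed yields a false value (the failure behavior is not stated here).

```perl
use strict;
use warnings;

# Hypothetical helper, not part of Apache::Log::Parser.
# $parser is any object with a parse($line) method, such as the one
# returned by Apache::Log::Parser->new( fast => 1 ).
sub count_by_rhost {
    my ($parser, @lines) = @_;
    my %count;
    for my $line (@lines) {
        chomp $line;
        my $log = $parser->parse($line);
        # Assumption: an unparsable line yields a false value; skip it.
        next unless $log && defined $log->{rhost};
        $count{ $log->{rhost} }++;
    }
    return \%count;    # hash reference: remote host => number of requests
}
```

With a real parser this might be called as count_by_rhost( Apache::Log::Parser->new( fast => 1 ), <$fh> ), where $fh is an already opened log filehandle.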
USAGE
This module requires either the 'fast' or the 'strict' option when a parser is instantiated.
The 'fast' parser works relatively fast. It can process only 'common', 'combined', and custom styles that are compatible with 'common' (additional fields after 'common'), and it cannot handle backslash-escaped double quotes in fields.
    # Default, for both 'combined' and 'common'
    my $parser = Apache::Log::Parser->new( fast => 1 );

    my $log1 = $parser->parse(<<COMBINED);
    192.168.0.1 - - [07/Feb/2011:10:59:59 +0900] "GET /path/to/file.html HTTP/1.1" 200 9891 "-" "DoCoMo/2.0 P03B(c500;TB;W24H16)"
    COMBINED
    # $log1->{rhost}, $log1->{date}, $log1->{path}, $log1->{referer}, $log1->{agent}, ...

    my $log2 = $parser->parse(<<COMMON); # parsed as 'common'
    192.168.0.1 - - [07/Feb/2011:10:59:59 +0900] "GET /path/to/file.html HTTP/1.1" 200 9891
    COMMON
    # For a custom style (additional fields after 'common'), 'combined', and 'common'
    # custom style: LogFormat "%h %l %u %t \"%r\" %>s %b \"%{Referer}i\" \"%{User-Agent}i\" \"%v\" \"%{cookie}n\" %D"
    my $c_parser = Apache::Log::Parser->new(
        fast => [ [qw(referer agent vhost usertrack request_duration)], 'combined', 'common' ]
    );

    my $log3 = $c_parser->parse(<<CUSTOM);
    192.168.0.1 - - [07/Feb/2011:10:59:59 +0900] "GET /index.html HTTP/1.1" 200 257 "http://example.com/referrer" "Any User-Agent" "example.com" "192.168.0.1201102091208001" 901
    CUSTOM
    # $log3->{agent}, $log3->{vhost}, $log3->{usertrack}, ...
The 'strict' parser works relatively slowly. It can process logs in any format, given a field separator, a list of field names, and a checker that validates the parsed result. It also handles backslash-escaped double quotes properly.
    # The 'strict' parser is available for log formats that are not compatible with 'common', like 'vhost_common' ("%v %h %l %u %t \"%r\" %>s %b")
    my @customized_fields = qw( rhost logname user datetime request status bytes referer agent vhost usertrack request_duration );
    my $strict_parser = Apache::Log::Parser->new( strict => [
        [ "\t", \@customized_fields, sub { my $x = shift; defined($x->{vhost}) and defined($x->{usertrack}) } ], # TABs as separator
        [ " ", \@customized_fields, sub { my $x = shift; defined($x->{vhost}) and defined($x->{usertrack}) } ],
        'combined',
        'common',
        'vhost_common',
    ]);
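The third element of each custom 'strict' entry is the checker. Judging from the inline subs, it receives the parsed hash reference and returns a true value when the result is acceptable; that reading is an assumption, not a documented contract. Under that assumption, a named and somewhat stricter checker might look like the sketch below. The name looks_complete and the numeric checks are hypothetical.

```perl
use strict;
use warnings;

# Hypothetical checker: accept a parse result only when the custom
# fields are present and the numeric fields look numeric.
sub looks_complete {
    my ($x) = @_;
    # Same presence test as the inline subs.
    return 0 unless defined $x->{vhost} && defined $x->{usertrack};
    # Extra checks (assumption): three-digit status, integer duration.
    return 0 unless defined $x->{status}
                 && $x->{status} =~ /\A\d{3}\z/;
    return 0 unless defined $x->{request_duration}
                 && $x->{request_duration} =~ /\A\d+\z/;
    return 1;
}
```

It could then replace an inline sub, for example [ " ", \@customized_fields, \&looks_complete ].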
    my $log4 = $strict_parser->parse(<<CUSTOM);
    192.168.0.1 - - [07/Feb/2011:10:59:59 +0900] "GET /index.html HTTP/1.1" 200 257 "http://example.com/referrer" "Any \"Quoted\" User-Agent" "example.com" "192.168.0.1201102091208001" 901
    CUSTOM
    $log4->{agent} #=> 'Any "Quoted" User-Agent'
    my $log5 = $strict_parser->parse(<<VHOST);
    example.com 192.168.0.1 - - [07/Feb/2011:10:59:59 +0900] "GET /index.html HTTP/1.1" 200 257
    VHOST
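To illustrate why backslash-escaped double quotes need special care, the sketch below extracts quoted fields in two ways. It is an illustration only and is not the module's implementation; the function names quoted_fields_naive and quoted_fields_escaped are hypothetical. The naive pattern ends a field at the first double quote, while the escape-aware pattern treats a backslash plus the following character as part of the field.

```perl
use strict;
use warnings;

# Illustration only: NOT how Apache::Log::Parser is implemented.

# Naive: a quoted field ends at the first double quote it meets.
sub quoted_fields_naive {
    my ($line) = @_;
    return $line =~ /"([^"]*)"/g;
}

# Escape-aware: a backslash and the character after it stay in the field.
sub quoted_fields_escaped {
    my ($line) = @_;
    my @fields = $line =~ /"((?:[^"\\]|\\.)*)"/g;
    s/\\(.)/$1/g for @fields;    # turn \" into " and \\ into \
    return @fields;
}

my $line = q{"GET / HTTP/1.1" 200 257 "-" "Any \"Quoted\" User-Agent"};

my @naive   = quoted_fields_naive($line);    # splits the last field apart
my @escaped = quoted_fields_escaped($line);  # keeps the last field whole

print scalar(@naive), " naive fields, ", scalar(@escaped), " escape-aware fields\n";
print "agent: $escaped[2]\n";
```

The escape-aware pattern does more work per character, which is consistent with the 'strict' parser being the slower of the two.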
LICENSE
This software is licensed under the same terms as Perl itself.
AUTHOR
TAGOMORI Satoshi <tagomoris at gmail.com>
SEE ALSO
http://httpd.apache.org/docs/2.2/mod/mod_log_config.html#formats