NAME
Mail::SpamCannibal::ParseMessage - parse mail headers
SYNOPSIS
  use Mail::SpamCannibal::ParseMessage qw(
        limitread
        dispose_of
        headers
        rfheaders
        skiphead
        get_MTAs
        firstremote
        array2string
        string2array
  );

  $chars = limitread(*H,\@lines,$limit);
  $rv = dispose_of(*H,$limit);
  $hdrs = headers(\@lines,\@headers);
  $hdrs = rfheaders(\@lines,\@headers);
  $lines = skiphead(\@lines);
  $mtas = get_MTAs(\@headers,\@mtas);
  $from = firstremote(\@MTAs,\@myhosts,$noprivate);
  $string = array2string(\@array,$begin,$end);
  $count = string2array($string,\@array);
DESCRIPTION
Mail::SpamCannibal::ParseMessage provides utilities to parse mail headers, and email messages whose content consists of mail headers, in order to find the originating Mail Transfer Agent (MTA).
  use Mail::SpamCannibal::ParseMessage qw(
        limitread
        dispose_of
        headers
        skiphead
        get_MTAs
        firstremote
        array2string
        string2array
  );

  # example of reading mail message from STDIN
  # read up to 10000 characters
  my @lines;
  exit unless limitread(*STDIN,\@lines,10000);

  # release the daemon feeding this script
  dispose_of(*STDIN);

  # optional, if message content is headers
  # skip the real headers on this message
  exit unless skiphead(\@lines);

  # linearize headers, convert multi-line headers
  # to single line, removing extra white space
  my @headers;
  exit unless headers(\@lines,\@headers);

  # get list of MTAs from headers
  my @mtas;
  exit unless get_MTAs(\@headers,\@mtas);

  # extract the first remote MTA from the
  # resulting MTA object
  my @myhosts = qw(
        mail1.mydomain.com
        mail2.mydomain.com
  );
  my $remoteIP = firstremote(\@mtas,\@myhosts);
SUBROUTINE DESCRIPTIONS
$chars = limitread(*H,\@lines,$limit);
Read up to $limit characters (or to end of file) from stream *H and place the lines in an array.
This is useful for reading an input stream which could overflow internal buffers if it were not in the expected format.
  input:        *H,             # stream handle
                array pointer,
                limit           # max characters [1000 default]
  returns:      number of characters read
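
The bounded read described above can be sketched in plain Perl. This is only an illustration of the idea, not the module's code: `limitread_sketch` is a hypothetical name, and the use of the built-in `read` followed by a split that keeps line endings is an assumption about behavior the text does not specify.

```perl
use strict;
use warnings;

# Hypothetical stand-in for limitread (not the module's implementation).
# Reads at most $limit characters from $fh and splits them into lines.
sub limitread_sketch {
    my ($fh, $lines, $limit) = @_;
    $limit = 1000 unless defined $limit;    # default taken from the text above
    my $buf = '';
    my $got = read($fh, $buf, $limit);      # stops at $limit or end of file
    return 0 unless $got;
    @$lines = split /^/, $buf;              # assumption: line endings are kept
    return $got;
}

# Demonstrate with an in-memory file handle instead of STDIN.
my $message = "line one\nline two\nline three\n";
open(my $fh, '<', \$message) or die "cannot open in-memory handle: $!";
my @lines;
my $got = limitread_sketch($fh, \@lines, 12);
close $fh;
print "read $got characters in ", scalar(@lines), " lines\n";
```

Because the limit is applied to characters rather than lines, the last array element may be a partial line, as it is here.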
$rv = dispose_of(*H,$limit);
Empty the stream *H: read until EOF, then return.
  input:        *H,     # stream handle
                limit   # max buffer size
                        # default 1000
  returns:      positive integer if anything was read,
                else zero
$hdrs = headers(\@lines,\@headers);
Reads lines from the array and returns them in an array of headers. The headers are unfolded into single lines.
  i.e.
  Received: from hotmail.com ([64.216.248.129])
        by mail.mydomain.com (8.12.8/8.12.8) with SMTP id h2KIRcYC029373;
        Thu, 20 Mar 2003 10:27:39 -0800

  would be returned as one header line with compressed white space

  input:        pointer to input line array,
                pointer to output header array
  returns:      number of headers
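
The unfolding step can be sketched in plain Perl as follows. This is a minimal illustration, not the module's implementation: `unfold_headers` is a hypothetical helper, and stopping at the first blank line is an assumption.

```perl
use strict;
use warnings;

# Hypothetical helper (not part of Mail::SpamCannibal::ParseMessage):
# append folded continuation lines to the header they belong to and
# compress runs of white space to a single space.
sub unfold_headers {
    my ($lines, $headers) = @_;
    @$headers = ();
    foreach my $line (@$lines) {
        (my $text = $line) =~ s/[\r\n]+$//;    # strip any line ending
        last if $text =~ /^\s*$/;              # assumption: blank line ends headers
        if ($text =~ /^\s/ && @$headers) {     # continuation of previous header
            $headers->[-1] .= ' ' . $text;
        } else {
            push @$headers, $text;
        }
    }
    for (@$headers) {
        s/\s+/ /g;                             # compress white space
        s/^ //;
        s/ $//;
    }
    return scalar @$headers;
}

my @lines = (
    "Received: from hotmail.com ([64.216.248.129])",
    "\tby mail.mydomain.com (8.12.8/8.12.8)",
    "\twith SMTP id h2KIRcYC029373;",
    "\tThu, 20 Mar 2003 10:27:39 -0800",
    "Subject: test",
    "",
    "body text",
);
my @headers;
my $count = unfold_headers(\@lines, \@headers);
print "$count headers\n$headers[0]\n";
```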
$hdrs = rfheaders(\@lines,\@headers);
Similar in function to "headers" above. Parsing is "dirty" in the sense that extraneous leading characters such as:
  >> etc...
are ignored, and lines improperly wrapped without leading white space (by your email client) are added to the correct header in a form that can be parsed by "get_MTAs".
This method is not as "pure" as just using "headers", but it does not require properly formatted header text with no leading spaces or characters.
  input:        pointer to input line array,
                pointer to output header array
  returns:      number of headers
$lines = skiphead(\@lines,\@discard);
Removes lines from the text array until one or more blank lines are found. Leading blank lines are removed and the top of the array is positioned at the first line with text.
Optionally, the skipped lines are returned in an array.
  input:        pointer to text lines,
                [optional] ptr to skip lines
  returns:      number of lines remaining
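
A rough plain-Perl sketch of this behavior is shown below. It is illustrative only: `skiphead_sketch` is a hypothetical name, the sample host `relay.example.com` is made up, and copying the blank separator lines into the discard array is an assumption.

```perl
use strict;
use warnings;

# Hypothetical stand-in for skiphead (not the module's code).
sub skiphead_sketch {
    my ($lines, $discard) = @_;
    # remove the leading block of text (the real headers)
    while (@$lines && $lines->[0] =~ /\S/) {
        my $line = shift @$lines;
        push @$discard, $line if $discard;
    }
    # remove the blank separator lines that follow
    while (@$lines && $lines->[0] !~ /\S/) {
        my $line = shift @$lines;
        push @$discard, $line if $discard;    # assumption: blanks are kept too
    }
    return scalar @$lines;
}

my @lines = (
    "Received: from relay.example.com",
    "Subject: forwarded spam",
    "",
    "",
    "Received: from hotmail.com ([64.216.248.129])",
    "Subject: original message",
);
my @skipped;
my $remaining = skiphead_sketch(\@lines, \@skipped);
print "$remaining lines remain, first is: $lines[0]\n";
```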
$mtas = get_MTAs(\@headers,\@mtas);
Return an array pointing to a structure of "Received: from" MTAs found in header lines.
  each array entry ->{from} = IP addr;
               |--->{by}   = host or IP;

  input:        pointer to header array
  returns:      number of MTA entries
$from = firstremote(\@MTAs,\@myhosts,$noprivate);
Parse the "Received: from" structure for the first remote MTA address that is neither in @myhosts nor part of a private network, where:
  @myhosts = (
        '12.34.56.78',          # a dot.quad address
        '12.34.56.0/28',        # a net block
        'mail.mydomain.com',    # a domain name
        'etc...',
  );
The IP addresses of "named" hosts will be resolved for multiple interfaces. If you do not want this behavior
The private networks listed below are automatically included in @myhosts by default. If you do not want this behavior, set $noprivate TRUE.
  127./8, 10./8, 172.16/12, 192.168/16

  input:        pointer to "Received: from" structure,
                pointer to array of local host names,
                [optional] no private nets = TRUE
  returns:      ip address of first "from" remote host,
                or an empty string [''] if the
                remote host can not be determined.
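
To illustrate the private-network exclusion, here is a small sketch that tests an IPv4 address against the four blocks listed above. It is not the module's code: `ip2int` and `is_private` are hypothetical helpers, and the plain integer masking stands in for whatever address arithmetic the module actually performs.

```perl
use strict;
use warnings;

# Private and loopback blocks named in the text: [network, prefix length].
my @private = (
    ['127.0.0.0', 8], ['10.0.0.0', 8], ['172.16.0.0', 12], ['192.168.0.0', 16],
);

# Convert a dotted-quad string to a 32-bit integer.
sub ip2int {
    my ($ip) = @_;
    return unpack('N', pack('C4', split(/\./, $ip)));
}

# Hypothetical helper: true if $ip falls inside any listed block.
sub is_private {
    my ($ip) = @_;
    my $addr = ip2int($ip);
    foreach my $net (@private) {
        my ($base, $bits) = @$net;
        my $mask = (0xFFFFFFFF << (32 - $bits)) & 0xFFFFFFFF;
        return 1 if ($addr & $mask) == (ip2int($base) & $mask);
    }
    return 0;
}

# Pick the first address that is not in a private block.
my @from = ('10.1.2.3', '192.168.0.9', '64.216.248.129');
my ($first) = grep { !is_private($_) } @from;
print "first non-private address: $first\n";
```

The sketch covers only dot.quad addresses; the net blocks and host names that @myhosts also accepts are not handled here.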
$end = trimmsg(\%MAILFILTER,\@lines)
If message length is limited by the MAXMSG configuration setting, remove duplicate blank lines and return the $end pointer for further processing.
  input:        pointer to MAILFILTER hash,
                pointer to @lines array
  returns:      ending line number
$string = array2string(\@array,$begin,$end);
Makes a string from the array elements beginning with $begin and ending with $end. If $begin is undefined, 0 is assumed. If $end is undefined, $#array is assumed. An empty string is returned if $begin > $end.
Unlike a 'join', 'array2string' adds an endline to the 'end' of the string in this manner:
  $string = join("\n",@array,"");

  input:        pointer to array of lines
  returns:      string
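
The documented defaults and the trailing endline can be sketched in plain Perl like this. `array2string_sketch` is a hypothetical stand-in written from the description above, not the module's source, and it does not handle an $end beyond the last element.

```perl
use strict;
use warnings;

# Hypothetical stand-in for array2string, following the description above.
sub array2string_sketch {
    my ($array, $begin, $end) = @_;
    $begin = 0        unless defined $begin;    # default: first element
    $end   = $#$array unless defined $end;      # default: last element
    return '' if $begin > $end;
    # the empty final element makes join add the trailing endline
    return join("\n", @{$array}[$begin .. $end], '');
}

my @array = ('one', 'two', 'three');
my $all  = array2string_sketch(\@array);          # whole array
my $part = array2string_sketch(\@array, 1, 1);    # a single element
my $none = array2string_sketch(\@array, 2, 1);    # $begin > $end
print $all;
```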
$count = string2array($string,\@array);
Convert a string into an array of separate lines. Suppresses multiple trailing blank lines. Considers a dangling line to be complete.
  i.e.  "once upon a
  time
  there were three"

  is the same as:

        "once upon a
  time
  there were three
  "

  input:        string or string pointer,
                pointer to array
  returns:      line count
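
The dangling-line rule can be illustrated with a short sketch. `string2array_sketch` is a hypothetical stand-in, not the module's code, and it assumes that all trailing endlines are dropped, which is one reading of "suppresses multiple trailing blank lines".

```perl
use strict;
use warnings;

# Hypothetical stand-in for string2array (not the module's implementation).
sub string2array_sketch {
    my ($string, $array) = @_;
    $string = $$string if ref $string;    # accept a string or a reference to one
    $string = '' unless defined $string;
    $string =~ s/\n+$//;                  # assumption: drop all trailing endlines
    @$array = split /\n/, $string, -1;    # keep blank lines inside the text
    return scalar @$array;
}

my (@dangling, @terminated);
my $text = "once upon a\ntime\nthere were three";
my $n1 = string2array_sketch($text, \@dangling);
my $n2 = string2array_sketch(\"$text\n", \@terminated);
print "$n1 and $n2 lines\n";
```

Both calls produce the same three-element array, whether or not the last line ends with an endline.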
DEPENDENCIES
NetAddr::IP::Lite version 0.02
EXPORT
none
EXPORT_OK
limitread
dispose_of
headers
rfheaders
skiphead
get_MTAs
firstremote
array2string
string2array
trimmsg
AUTHOR
Michael Robinton <michael@bizsystems.com>
COPYRIGHT
Copyright 2003 - 2007, Michael Robinton <michael@bizsystems.com>
This program is free software; you can redistribute it and/or modify it under the terms of the GNU General Public License as published by the Free Software Foundation; either version 2 of the License, or (at your option) any later version.
This program is distributed in the hope that it will be useful, but WITHOUT ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License for more details.
You should have received a copy of the GNU General Public License along with this program; if not, write to the Free Software Foundation, Inc., 59 Temple Place - Suite 330, Boston, MA 02111-1307, USA.
SEE ALSO
perl(1)