NAME

Algorithm::Bayesian - Bayesian Spam Filtering Algorithm

SYNOPSIS

    use Algorithm::Bayesian;
    use Tie::Foo;

    my %storage;
    tie %storage, 'Tie::Foo', ...;
    my $b = Algorithm::Bayesian->new(\%storage);

    $b->spam('spamword1', 'spamword2', ...);
    $b->ham('hamword1', 'hamword2', ...);

    my $pr = $b->test('word1', 'word2', ...);

DESCRIPTION

Algorithm::Bayesian provides an easy way to apply Bayesian spam filtering.

SUBROUTINES/METHODS

new

    my $b = Algorithm::Bayesian->new(\%hash);

Constructor. Takes a reference to the hash used for storage. A plain hash is fine. To persist the data, tie the hash to Tie::DBI (for an RDBMS) or to another key-value store.
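Because the constructor only needs a hash reference, any tied hash can serve as the storage. The following is a minimal sketch of a tied hash built on the core Tie::StdHash class. The package name My::CountingStore and its writes method are hypothetical and exist only for illustration. The sketch does not load Algorithm::Bayesian itself, so the constructor call is shown as a comment.

```perl
use strict;
use warnings;

# Hypothetical tied-hash class: behaves like a normal hash,
# but counts how many times a value is stored.
package My::CountingStore;
require Tie::Hash;                  # core module that defines Tie::StdHash
our @ISA = ('Tie::StdHash');

my $writes = 0;

sub STORE {
    my ($self, $key, $value) = @_;
    $writes++;
    return $self->SUPER::STORE($key, $value);
}

sub writes { return $writes }

package main;

tie my %storage, 'My::CountingStore';

# A reference to the tied hash could then be handed to the constructor:
#   my $b = Algorithm::Bayesian->new(\%storage);

# Any code that updates the hash goes through STORE.
$storage{example}++;
$storage{example}++;

print "value: $storage{example}, writes: ", My::CountingStore->writes, "\n";
```

A real backend such as Tie::DBI replaces the in-memory STORE and FETCH with database calls, but it is tied and passed to the constructor in the same way.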

getHam

    my $num = $b->getHam($word);

Returns the number of times $word has been counted as ham.

getSpam

    my $num = $b->getSpam($word);

Returns the number of times $word has been counted as spam.

ham

    $b->ham(@words);

Train @words as Ham.

spam

    $b->spam(@words);

Train @words as Spam.

test

    my $pr = $b->test(@words);

Calculates the spam probability of @words. $pr ranges from 0 to 1.
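For readers unfamiliar with how per-word probabilities become one score, here is a minimal sketch of the usual naive Bayes combination. It is an illustration under that assumption, not necessarily the formula this module uses internally. The function name combine_probabilities is hypothetical.

```perl
use strict;
use warnings;

# Illustrative helper, not part of Algorithm::Bayesian.
# Combines per-word spam probabilities p_1..p_n as
#   P = (p_1 * ... * p_n) / (p_1 * ... * p_n + (1 - p_1) * ... * (1 - p_n))
sub combine_probabilities {
    my @pr = @_;
    return 0.5 unless @pr;              # no words: no evidence either way

    my ($spam, $ham) = (1, 1);
    for my $p (@pr) {
        $spam *= $p;
        $ham  *= 1 - $p;
    }

    return 0.5 if $spam + $ham == 0;    # contradictory 0 and 1 inputs
    return $spam / ($spam + $ham);
}

my $score = combine_probabilities(0.9, 0.8, 0.3);
printf "%.3f\n", $score;                # prints 0.939
```

With many words the plain floating-point products can underflow, so a production implementation would typically clamp each probability away from 0 and 1, or work with logarithms or arbitrary-precision numbers.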

testWord

    my $pr = $b->testWord($word);

Calculates the spam probability of $word.

$pr ranges from 0 to 1. For a word that has never been trained, $pr is 0.5.
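As a rough illustration of how a per-word probability can be derived from training counts, here is a minimal sketch based on relative frequencies. The exact formula used by this module is not stated here, so treat this as an assumption. The function name word_probability and its four parameters are hypothetical.

```perl
use strict;
use warnings;

# Illustrative helper, not part of Algorithm::Bayesian.
# $spam_count / $ham_count: how often the word was seen in spam / ham.
# $spam_total / $ham_total: total words trained as spam / ham.
sub word_probability {
    my ($spam_count, $ham_count, $spam_total, $ham_total) = @_;

    # An unseen word carries no evidence, so it scores a neutral 0.5.
    return 0.5 if $spam_count + $ham_count == 0;

    my $spam_freq = $spam_total ? $spam_count / $spam_total : 0;
    my $ham_freq  = $ham_total  ? $ham_count  / $ham_total  : 0;

    return 0.5 if $spam_freq + $ham_freq == 0;   # inconsistent totals
    return $spam_freq / ($spam_freq + $ham_freq);
}

printf "%.2f\n", word_probability(0, 0, 10, 10);   # unseen word
printf "%.2f\n", word_probability(3, 1, 10, 10);   # mostly seen in spam
```

The counts would come from calls such as getSpam and getHam. The totals are an assumed extra input needed to turn raw counts into frequencies.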

AUTHOR

Gea-Suan Lin, <gslin at gslin.org>

LICENSE AND COPYRIGHT

Copyright 2010 Gea-Suan Lin.

This program is free software; you can redistribute it and/or modify it under the terms of either: the GNU General Public License as published by the Free Software Foundation; or the Artistic License.

See http://dev.perl.org/licenses/ for more information.