The Perl Toolchain Summit needs more sponsors. If your company depends on Perl, please support this very important event.

NAME

TREE::FP - Perl implementation of the FP-Tree

SYNOPSIS

        use Tree::FP;
        
        $fptree = Tree::FP->new('a','b','c');
        
        $insert_successful = $fptree->insert_tree('c','a','b');
        
        $decimal_support = $fptree->support;
        $fptree->set_support(0.3);
        
        $decimal_confidence = $fptree->confidence;
        $fptree->set_confidence(0.25);
        
        @rules = $fptree->association_rules;
        
        $fptree->reset_tree;
        
        $error_string = $fptree->err;
        

DESCRIPTION

Tree:FP is a Perl implmentation of the FP-Tree based association rule mining algorithm (association rules == market basket analysis). For a detailed explanation, see "Mining Frequent Patterns without Candidate Generation" by Jiawei Han, Jian Pei, and Yiwen Yin, 2000. Contrarywise, most books on data mining will have information on this algorithm.

The short version is this: instead of generating a huge number of candidate sets for the apriori algorithm and requiring multiple database scans, compress information into a new data structure, a Frequent Pattern (or FP) tree, then mine the tree.

VERSION

        0.04

METHODS

new( LIST )

The new method is called with a list of frequent items from the transactional database listed in descending support order. Given the following DB =head1

        Item | Count
        ------------
        itm1 | 2
        itm2 | 4
        itm3 | 3
        itm4 | 5
        itm5 | 3

The code for creating a new FP-Tree would be: $fptree = Tree::FP->new('itm4','itm2','itm3','itm5','itm1');

NOTE: The list can also be of integers, which is the more likely scenario for most TDBs.

insert_tree( LIST )

This is the method used to populate the FP-Tree. The list consists of all items for one transaction. The items need NOT be in any order. The method returns 0 (false) is an error occurred, and 1 (true) otherwise.

Example:

        $fptree->insert_tree('itm1','itm2','itm3');

support

Returns the current minimum percentage support for the FP-tree, expressed as a decimal. 10% = 0.1

set_support( FLOAT )

Sets the current minimum percentage support for the FP-tree.

confidence

Returns the current minimum percentage confidence for the FP-tree, expressed as a decimal. 10% = 0.1 NOTE: Currently this method has no effect on performance of the FP-Tree. Future versions may allow result filtering based on confidence.

set_confidence( FLOAT )

Sets the current minimum percentage confidence for the FP-tree.

association_rules

Returns list of assocation rules for the FP-Tree meeting minimum support, listed in descending order of confidence. Each element of the list is actually an FP_Tree_association_rule object with the following four methods:

        left - returns left side of association rule (a "pattern") as a list, each element of which is an item
        right - returns right side of association rule (a "pattern") as a list, each element of which is an item
        support - support for the rule
        confidence - confidence for the rule

Example:

        @rules = $fptree->association_rules;
        @left = $rules[0]->left;
        @right = $rules[0]->right;
        $support = $rules[0]->support;
        $confidence = $rules[0]->confidence;

err

Returns the last error that occurred in the FP-Tree. In general, if any of the methods returns false (0 or undef), this method will provide details as to what went wrong.

NOTES

This package includes three other packages, namely FP_Tree_node, FP_Tree_header_node, and FP_Tree_association_rule. Outside of the context of an FP-Tree, it is not likely that they have much utility, however feel free to use these. Also, there is a combinations function in this package that can be used for finding all combinations of elements of an array (but this must be explicitly exported).

AUTHOR

Martin Paczynski, nitram@cpan.org.

COPYRIGHT

Copyright 2003, Martin Paczynski, nitram@cpan.org, all rights reserved.

This package is free software and is provided "as is" without express or implied warranty. It may be used, redistributed and/or modified under the same terms as Perl itself.