
NAME

AI::Categorizer::Learner - Abstract Machine Learner Class


SYNOPSIS

 use AI::Categorizer::Learner::NaiveBayes;  # Or other subclass

 # Here $k is an AI::Categorizer::KnowledgeSet object
 my $nb = AI::Categorizer::Learner::NaiveBayes->new(...parameters...);
 $nb->train(knowledge_set => $k);

 # ... time passes ...

 $nb = AI::Categorizer::Learner::NaiveBayes->restore_state('filename');
 my $c = AI::Categorizer::Collection::Files->new( path => ... );
 while (my $document = $c->next) {
   my $hypothesis = $nb->categorize($document);
   print "Best assigned category: ", $hypothesis->best_category, "\n";
   print "All assigned categories: ", join(', ', $hypothesis->categories), "\n";
 }


DESCRIPTION

AI::Categorizer::Learner is an abstract class; your code will never use it directly. Instead, you will use a subclass such as AI::Categorizer::Learner::NaiveBayes, which implements an actual machine learning algorithm.

This page documents the general Learner interface.



METHODS

new()

Creates a new Learner and returns it. Accepts the following parameters:


knowledge_set

An AI::Categorizer::KnowledgeSet object that the train() method uses by default.


verbose

If true, the Learner displays some diagnostic output while training and categorizing documents.

train(knowledge_set => $k)

Trains the categorizer, preparing it for later use in categorizing documents. The knowledge_set parameter must be an object of class AI::Categorizer::KnowledgeSet (or a subclass thereof), populated with documents and their categories. See AI::Categorizer::KnowledgeSet for details on how to create such an object. If you also provided a knowledge_set parameter to new(), the one given here overrides it.


categorize($document)

Returns an AI::Categorizer::Hypothesis object representing the categorizer's "best guess" about which categories the given document should be assigned to. See AI::Categorizer::Hypothesis for more details on how to use this object.
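
The new(), train(), and categorize() calls described above can be sketched end to end as follows. This is a minimal illustration rather than a recommended setup: the four training sentences and the category names are invented toy data, and it assumes that KnowledgeSet's make_document() accepts category names as plain strings and that in_category() is available on the Hypothesis object.

```perl
use strict;
use warnings;
use AI::Categorizer::KnowledgeSet;
use AI::Categorizer::Document;
use AI::Categorizer::Learner::NaiveBayes;

# Toy training data, invented for this example.
my %training = (
  doc1 => { categories => ['farming'], content => 'Sheep are very valuable in farming.' },
  doc2 => { categories => ['farming'], content => 'Farming requires many kinds of animals.' },
  doc3 => { categories => ['vampire'], content => 'Vampires drink blood and vampires may be staked.' },
  doc4 => { categories => ['vampire'], content => 'Vampires cannot see their images in mirrors.' },
);

# Build a small knowledge set in memory (assumes make_document() as described above).
my $k = AI::Categorizer::KnowledgeSet->new(name => 'toy');
$k->make_document(name => $_, %{ $training{$_} }) for sort keys %training;

# The knowledge set passed to new() is the default for train().
my $nb = AI::Categorizer::Learner::NaiveBayes->new(knowledge_set => $k, verbose => 0);
$nb->train;

# Categorize one unseen document and inspect the hypothesis.
my $doc = AI::Categorizer::Document->new(
  name    => 'query',
  content => 'I would like to begin farming sheep.',
);
my $hypothesis = $nb->categorize($doc);

print "Best category: ", $hypothesis->best_category, "\n";
print "All categories: ", join(', ', $hypothesis->categories), "\n";
print "In 'farming'? ", ($hypothesis->in_category('farming') ? 'yes' : 'no'), "\n";
```

With so little training data the scores are only illustrative; a real knowledge set would normally be loaded from a collection of documents.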

categorize_collection(collection => $collection)

Categorizes every document in a collection and returns an AI::Categorizer::Experiment object representing the results. Note that the Experiment does not record the categories assigned to each individual document, only a statistical summary of the results.
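
As a hedged sketch of how the statistical summary might be obtained, the example below trains on a few invented sentences and then evaluates two labelled test documents. It assumes the distribution's AI::Categorizer::Collection::InMemory class accepts a data hash of the shape shown, that make_document() accepts category names as strings, and that the Experiment object provides stats_table(); treat each of these as an assumption to check against your installed version.

```perl
use strict;
use warnings;
use AI::Categorizer::KnowledgeSet;
use AI::Categorizer::Collection::InMemory;
use AI::Categorizer::Learner::NaiveBayes;

# Toy training data, invented for this example.
my %training = (
  doc1 => { categories => ['farming'], content => 'Sheep are very valuable in farming.' },
  doc2 => { categories => ['farming'], content => 'Farming requires many kinds of animals.' },
  doc3 => { categories => ['vampire'], content => 'Vampires drink blood and vampires may be staked.' },
  doc4 => { categories => ['vampire'], content => 'Vampires cannot see their images in mirrors.' },
);

my $k = AI::Categorizer::KnowledgeSet->new(name => 'toy');
$k->make_document(name => $_, %{ $training{$_} }) for sort keys %training;

my $nb = AI::Categorizer::Learner::NaiveBayes->new(verbose => 0);
$nb->train(knowledge_set => $k);

# Labelled test documents held in memory (assumed InMemory data layout).
my $test_set = AI::Categorizer::Collection::InMemory->new(
  data => {
    test1 => { categories => ['farming'], content => 'I would like to begin farming sheep.' },
    test2 => { categories => ['vampire'], content => 'Vampires may drink blood near mirrors.' },
  },
);

# The Experiment holds aggregate statistics, not per-document assignments.
my $experiment = $nb->categorize_collection(collection => $test_set);
print $experiment->stats_table;
```

If per-document results are needed, iterate over the collection yourself and call categorize() on each document instead.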


knowledge_set()

Gets or sets the internal knowledge_set member. Because the knowledge set may be enormous, some Learners may discard it after training or after restoring state from a file.


save_state()

Saves the Learner for later use. This method is inherited from AI::Categorizer::Storable.


restore_state()

Returns a Learner saved in a file with save_state(). This method is inherited from AI::Categorizer::Storable.
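
A save and restore round trip might look like the sketch below. The training sentences are invented toy data, and the example assumes that save_state() takes a single path argument and that make_document() accepts category names as strings. The target path is deliberately one that does not yet exist inside a temporary directory, since how an existing path is handled is not documented here.

```perl
use strict;
use warnings;
use File::Spec;
use File::Temp qw(tempdir);
use AI::Categorizer::KnowledgeSet;
use AI::Categorizer::Document;
use AI::Categorizer::Learner::NaiveBayes;

# Toy training data, invented for this example.
my %training = (
  doc1 => { categories => ['farming'], content => 'Sheep are very valuable in farming.' },
  doc2 => { categories => ['farming'], content => 'Farming requires many kinds of animals.' },
  doc3 => { categories => ['vampire'], content => 'Vampires drink blood and vampires may be staked.' },
  doc4 => { categories => ['vampire'], content => 'Vampires cannot see their images in mirrors.' },
);

my $k = AI::Categorizer::KnowledgeSet->new(name => 'toy');
$k->make_document(name => $_, %{ $training{$_} }) for sort keys %training;

my $nb = AI::Categorizer::Learner::NaiveBayes->new(verbose => 0);
$nb->train(knowledge_set => $k);

# Save to a fresh path under a temporary directory that is cleaned up at exit.
my $path = File::Spec->catdir(tempdir(CLEANUP => 1), 'nb_state');
$nb->save_state($path);

# restore_state() is called on the class, as in the synopsis.
my $restored = AI::Categorizer::Learner::NaiveBayes->restore_state($path);

my $doc = AI::Categorizer::Document->new(
  name    => 'query',
  content => 'I would like to begin farming sheep.',
);
my $before = $nb->categorize($doc)->best_category;
my $after  = $restored->categorize($doc)->best_category;
print "Before save: $before, after restore: $after\n";
```

Because a restored Learner may no longer carry its knowledge set, avoid relying on knowledge_set() after calling restore_state().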


AUTHOR

Ken Williams, ken@mathforum.org


COPYRIGHT

Copyright 2000-2003 Ken Williams. All rights reserved.

This library is free software; you can redistribute it and/or modify it under the same terms as Perl itself.