NAME

Text::SenseClusters::LabelEvaluation::AssigningLabelUsingHungarianAlgo - Module that uses the Hungarian algorithm to assign labels to clusters.

SYNOPSIS

        The following code snippet shows how to use this module.

        # Including the AssigningLabelUsingHungarianAlgo Module.
        use Text::SenseClusters::LabelEvaluation::AssigningLabelUsingHungarianAlgo;
        
        # Defining the matrix which contains the similarity scores for labels and clusters.
        my @mat = ( [ 2, 4, 7 ], [ 3, 9, 5 ], [ 8, 2, 9 ], );

        # Defining the headers for this matrix.
        my @topicHeader = ("BillClinton", "TonyBlair", "EhudBarak");
        my @clusterHeader = ("Cluster0", "Cluster1", "Cluster2");
        
        # Uncomment these to test unbalanced scenarios, where the number of clusters and labels differs.
        # Test Case 2:  
        #my @mat = ( [ 7, 1, 6, 8, 4 ], [ 8, 6, 5, 9, 8 ], [ 7, 6, 5, 8, 2 ], );
        #my @topicHeader = ("BillClinton", "TonyBlair", "EhudBarak", "SaddamHussien", "VladmirPutin");
        #my @clusterHeader = ("Cluster0", "Cluster1", "Cluster2");
        
        # Test Case 3:  
        #my @mat = ( [ 7, 1, 6 ], [ 8, 6, 5 ], [ 7, 6, 5 ], [ 8, 9, 8 ], [ 1, 0, 1 ]);
        #my @topicHeader = ("BillClinton", "TonyBlair", "SaddamHussien");
        #my @clusterHeader = ("Cluster0", "Cluster1", "Cluster2", "Cluster3", "Cluster4");


        # Creating the Hungarian object.
        my $hungarainObject = Text::SenseClusters::LabelEvaluation::AssigningLabelUsingHungarianAlgo
                                                ->new(\@mat, \@topicHeader, \@clusterHeader);

        # Assigning the labels to clusters using Hungarian algorithm.
        my $accuracy = $hungarainObject->reAssigningWithHungarianAlgo();

        # Alternatively, also capture the new matrix and the new column headers,
        # which contain the mapping between clusters and labels.
        #my ($accuracy,$finalMatrixRef,$newColumnHeaderRef) = 
        #               $hungarainObject->reAssigningWithHungarianAlgo();

        # The following function prints the resulting matrix.
        #Text::SenseClusters::LabelEvaluation::AssigningLabelUsingHungarianAlgo::printMatrix 
        #               ($finalMatrixRef, $newColumnHeaderRef, \@clusterHeader);

        print "\n\nAccuracy of labels is $accuracy. ";
        print "\n";

DESCRIPTION

This module assigns labels to clusters using the Hungarian algorithm.

Please refer to the following for a detailed explanation of the Hungarian algorithm: http://search.cpan.org/~tpederse/Algorithm-Munkres-0.08/lib/Algorithm/Munkres.pm

Constructor: new()

This is the constructor, which creates an object of this class. Reference: http://perldoc.perl.org/perlobj.html

This constructor takes the following arguments and initializes the object with them:

        1. Matrix:
                        This is a 2-D array containing the similarity scores.
                        The inverse of these scores is passed to the Hungarian
                        algorithm, because the algorithm assigns on the minimum
                        scores (as diagonal scores) while this module needs the
                        maximum scores for the assignment.

        2. Column Header:
                        This is a 1-D array which contains the header information
                        for each column.

        3. Row Header:
                        This is a 1-D array which contains the header information
                        for each row.

function: reAssigningWithHungarianAlgo

This method assigns a label to each cluster using the Hungarian algorithm. While assigning the labels, it considers the similarity scores of these labels with the gold standard keys.

@argument : $hungarianObject DataType(Reference of the object of this class)

@return : $accuracy : DataType(Float) Indicates the overall accuracy of the assignments.

OR

@return : $accuracy : DataType(Float) Indicates the overall accuracy of the assignments.

\@final : DataType(Reference of 2-D Array) Reference to a two-dimensional array whose diagonal values contain the similarity scores between cluster labels and gold standard keys.

\@newColumnHeader : DataType(Reference of 1-D Array) Reference to the new order of the column headers, which corresponds to the changed diagonal elements.

@description :

        1). It reads the matrix containing the similarity scores between each
                cluster's labels and the gold keys data.
        2). It then calls a function which inverts the similarity scores.
        3). Then, it calls the 'assign' function from "Algorithm::Munkres"
                with these inverted scores.
        4). It calculates the accuracy of the assignment as

                                Sum (Diagonal Scores)
                Accuracy =  -------------------------
                                Sum (All the Scores)

        5). Finally, the new arrangement is used to determine the new headers for
                each column.
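
The steps above can be sketched with core Perl only. This is a minimal illustration, not the module's implementation: an exhaustive search over column permutations stands in for the 'assign' call from Algorithm::Munkres (practical only for tiny square matrices), and the names permutations, @cost and @assignment are hypothetical.

```perl
use strict;
use warnings;

# Similarity scores from the SYNOPSIS: rows are clusters, columns are labels.
my @mat = ( [ 2, 4, 7 ], [ 3, 9, 5 ], [ 8, 2, 9 ] );

# Step 2: invert each score (0 stays 0) so a high similarity becomes a low cost.
my @cost = map { [ map { $_ ? 1 / $_ : 0 } @$_ ] } @mat;

# Hypothetical helper: returns every ordering of its arguments as array refs.
sub permutations {
    my @items = @_;
    return ( [] ) unless @items;
    my @result;
    for my $i ( 0 .. $#items ) {
        my @rest = @items;
        my ($picked) = splice @rest, $i, 1;
        push @result, [ $picked, @$_ ] for permutations(@rest);
    }
    return @result;
}

# Step 3 (stand-in for Algorithm::Munkres): pick the row-to-column
# assignment with the smallest total cost by brute force.
my ( $best_cost, @assignment );
for my $perm ( permutations( 0 .. $#mat ) ) {
    my $total = 0;
    $total += $cost[$_][ $perm->[$_] ] for 0 .. $#mat;
    if ( !defined $best_cost || $total < $best_cost ) {
        $best_cost  = $total;
        @assignment = @$perm;
    }
}

# Step 4: accuracy = sum of the assigned (diagonal) scores / sum of all scores.
my ( $assigned_sum, $all_sum ) = ( 0, 0 );
$assigned_sum += $mat[$_][ $assignment[$_] ] for 0 .. $#mat;
for my $row (@mat) { $all_sum += $_ for @$row; }
my $accuracy = $assigned_sum / $all_sum;

printf "Assignment (row -> column): %s\n", join ' ', @assignment;
printf "Accuracy: %d / %d = %.4f\n", $assigned_sum, $all_sum, $accuracy;
# prints:
# Assignment (row -> column): 2 1 0
# Accuracy: 24 / 49 = 0.4898
```

In this sketch, reordering the columns as (2, 1, 0) puts the assigned scores 7, 9 and 8 on the diagonal, which is the reordering that step 5 would reflect in the column headers.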
        

function: inverseMatrixCellValue

This method inverts (takes the reciprocal of) the value of each cell of the input matrix.

@argument : $matRef : DataType(Reference of the 2-D Matrix) This is a 2-D array containing the integer values which will be inverted.

@return : $inverseMatrixRef : DataType(Reference of the 2-D Matrix) This is a 2-D array containing the inverted values of the input 2-D array.

@description :

        1). Each value of the input 2-D array is inverted and stored in a new
                2-D array:

                                    1
                New-value =  ----------------
                              Original-Value

        2). If the Original-Value = 0, New-value = 0.
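
The two rules above can be illustrated with a short sketch. The function name inverse_matrix_cell_value is a hypothetical stand-in that follows the documented behavior; it is not the module's own code.

```perl
use strict;
use warnings;

# Hypothetical stand-in for inverseMatrixCellValue: takes a reference to a
# 2-D array and returns a reference to a new 2-D array of reciprocals.
sub inverse_matrix_cell_value {
    my ($mat_ref) = @_;
    my @inverse;
    for my $row (@$mat_ref) {
        # A zero score stays zero, which also avoids a division by zero.
        push @inverse, [ map { $_ == 0 ? 0 : 1 / $_ } @$row ];
    }
    return \@inverse;
}

my $inverse_ref = inverse_matrix_cell_value( [ [ 2, 4 ], [ 0, 5 ] ] );
print join( ' ', @$_ ), "\n" for @$inverse_ref;
# prints:
# 0.5 0.25
# 0 0.2
```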
        

function: printMatrix

This method prints the content of a 2-D array in matrix format.

@argument1 : $matRef : DataType(Reference of the 2-D Array) This is the 2-D array which has to be printed in matrix format.

@argument2 : $colHeaderRef : DataType(Reference of the 1-D array) Reference to an array containing header info for the columns.

@argument3 : $rowHeaderRef : DataType(Reference of the 1-D array) Reference to an array containing header info for the rows.

@description : 1. Method for printing the matrix. If the user provides their own headers, this method uses them; otherwise it prints default headers.
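
The header fallback described above might look like the following sketch. The helper name format_matrix, the tab-separated layout and the default header names ("Col0", "Row0", ...) are all assumptions made for illustration; the module's actual output format may differ.

```perl
use strict;
use warnings;

# Hypothetical helper: renders a 2-D array as tab-separated text.
# The default header names are assumptions, not necessarily what
# printMatrix itself prints.
sub format_matrix {
    my ( $mat_ref, $col_header_ref, $row_header_ref ) = @_;
    my @rows     = @$mat_ref;
    my $num_cols = @rows ? scalar @{ $rows[0] } : 0;

    # Fall back to generated headers when the caller supplies none.
    my @col_header = $col_header_ref
        ? @$col_header_ref
        : map { "Col$_" } 0 .. $num_cols - 1;
    my @row_header = $row_header_ref
        ? @$row_header_ref
        : map { "Row$_" } 0 .. $#rows;

    # First line: an empty corner cell followed by the column headers.
    my $text = join( "\t", '', @col_header ) . "\n";
    for my $i ( 0 .. $#rows ) {
        $text .= join( "\t", $row_header[$i], @{ $rows[$i] } ) . "\n";
    }
    return $text;
}

# Row headers supplied by the caller, column headers left to the default.
print format_matrix( [ [ 2, 4 ], [ 3, 9 ] ], undef, [ 'Cluster0', 'Cluster1' ] );
```

Returning the text instead of printing it directly is a design choice of this sketch that makes the result easy to check or redirect.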

SEE ALSO

http://senseclusters.cvs.sourceforge.net/viewvc/senseclusters/LabelEvaluation/

Last modified by : $Id: AssigningLabelUsingHungarianAlgo.pm,v 1.5 2013/03/07 23:19:41 jhaxx030 Exp $

AUTHORS

        Anand Jha, University of Minnesota, Duluth
        jhaxx030 at d.umn.edu

        Ted Pedersen, University of Minnesota, Duluth
        tpederse at d.umn.edu

COPYRIGHT AND LICENSE

Copyright (C) 2012,2013 Ted Pedersen, Anand Jha

See http://dev.perl.org/licenses/ for more information.

This program is free software; you can redistribute it and/or modify it under the terms of the GNU General Public License as published by the Free Software Foundation; either version 2 of the License, or (at your option) any later version.

This program is distributed in the hope that it will be useful, but WITHOUT ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License for more details.

You should have received a copy of the GNU General Public License along with this program; if not, write to:

        The Free Software Foundation, Inc., 59 Temple Place, Suite 330, 
        Boston, MA  02111-1307  USA