Text::NSP::Measures::3D::MI::ll - Perl module that implements the log-likelihood measure of association for trigrams.


Basic Usage

  use Text::NSP::Measures::3D::MI::ll;

  $ll_value = calculateStatistic( n111=>10,

  if( ($errorCode = getErrorCode())) {
    print STDERR $errorCode." - ".getErrorMessage()."\n";
  } else {
    print getStatisticName()." value for trigram is ".$ll_value."\n";
  }


The log-likelihood ratio measures the deviation between the observed data and what would be expected if <word1>, <word2> and <word3> were independent. The higher the score, the less evidence there is in favor of concluding that the words are independent.

The expected values for the internal cells are calculated by taking the product of their associated marginals and dividing by the square of the sample size (nppp), for example:

            n1pp * np1p * npp1
   m111 =  --------------------
                 nppp^2
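
As a minimal sketch of the expected-value calculation, the following Perl computes all eight expected cell counts from the single-word marginals. The counts are hypothetical, and the names n2pp, np2p, npp2 and nppp (complementary marginals and sample size) are assumptions made for this illustration, not identifiers taken from the module.

```perl
use strict;
use warnings;

# Hypothetical marginal counts, chosen only for illustration.
my %n = (
    n1pp => 40,    # trigrams with word1 in the first position
    np1p => 45,    # trigrams with word2 in the second position
    npp1 => 42,    # trigrams with word3 in the third position
    nppp => 100,   # total number of trigrams (sample size)
);

# Complementary marginals: the word is absent from that position.
$n{n2pp} = $n{nppp} - $n{n1pp};
$n{np2p} = $n{nppp} - $n{np1p};
$n{npp2} = $n{nppp} - $n{npp1};

# Expected count for each cell under full independence:
#   m_ijk = n_ipp * n_pjp * n_ppk / nppp^2
my %m;
for my $i (1, 2) {
    for my $j (1, 2) {
        for my $k (1, 2) {
            $m{"m$i$j$k"} =
                $n{"n${i}pp"} * $n{"np${j}p"} * $n{"npp$k"} / $n{nppp}**2;
        }
    }
}

printf "%s = %.3f\n", $_, $m{$_} for sort keys %m;
```

With these counts m111 is 40 * 45 * 42 / 100^2 = 7.56, and the eight expected values sum to the sample size.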

Then the deviation between observed and expected values for each internal cell is computed to arrive at the log-likelihood value.

 Log-Likelihood = 2 * [n111 * log(n111/m111) + n112 * log(n112/m112) +
           n121 * log(n121/m121) + n122 * log(n122/m122) +
           n211 * log(n211/m211) + n212 * log(n212/m212) +
           n221 * log(n221/m221) + n222 * log(n222/m222)]
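
The formula above can be sketched in Perl as follows. The observed cell counts are hypothetical, the helper hashes %first, %second and %third are names invented for this example, and skipping cells with a zero observed count (treating 0 * log 0 as 0) is an assumption of the sketch rather than documented module behavior.

```perl
use strict;
use warnings;

# Hypothetical observed cell counts; they sum to 100.
my %obs = (
    n111 => 10, n112 => 10, n121 => 13, n122 => 7,
    n211 => 11, n212 => 14, n221 => 8,  n222 => 27,
);

# Recover the single-word marginals and the sample size from the cells.
my $total = 0;
my (%first, %second, %third);
for my $cell (keys %obs) {
    my ($i, $j, $k) = $cell =~ /^n(\d)(\d)(\d)$/;
    $first{$i}  += $obs{$cell};
    $second{$j} += $obs{$cell};
    $third{$k}  += $obs{$cell};
    $total      += $obs{$cell};
}

# Accumulate n_ijk * log(n_ijk / m_ijk) over the eight cells.
my $sum = 0;
for my $cell (keys %obs) {
    my ($i, $j, $k) = $cell =~ /^n(\d)(\d)(\d)$/;
    my $expected = $first{$i} * $second{$j} * $third{$k} / $total**2;
    next if $obs{$cell} == 0;    # assumed convention: 0 * log 0 = 0
    $sum += $obs{$cell} * log($obs{$cell} / $expected);
}

my $ll = 2 * $sum;
printf "Log-Likelihood = %.4f\n", $ll;
```

Perl's log is the natural logarithm, which matches the usual definition of the log-likelihood ratio.
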

calculateStatistic($count_values) - This method calculates the log-likelihood (ll) value.

INPUT PARAMS : $count_values .. Reference to a hash containing the count values computed by the program.

RETURN VALUES : $loglikelihood .. Log-likelihood value for this trigram.
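
To illustrate how a hash of count values relates to the eight internal cells, here is a minimal sketch that derives the cells by inclusion-exclusion. Only the key n111 appears in the usage text above; the other key names (n11p, n1p1, np11, n1pp, np1p, npp1, nppp) and all the numbers are assumptions made for this example.

```perl
use strict;
use warnings;

# Hypothetical count values; key names other than n111 are assumed.
my %c = (
    n111 => 10,                            # word1 word2 word3 together
    n11p => 20, n1p1 => 23, np11 => 21,    # pairwise marginals
    n1pp => 40, np1p => 45, npp1 => 42,    # single-word marginals
    nppp => 100,                           # total number of trigrams
);

# Inclusion-exclusion recovers the seven remaining internal cells.
my %obs = ( n111 => $c{n111} );
$obs{n112} = $c{n11p} - $c{n111};
$obs{n121} = $c{n1p1} - $c{n111};
$obs{n211} = $c{np11} - $c{n111};
$obs{n122} = $c{n1pp} - $c{n11p} - $c{n1p1} + $c{n111};
$obs{n212} = $c{np1p} - $c{n11p} - $c{np11} + $c{n111};
$obs{n221} = $c{npp1} - $c{n1p1} - $c{np11} + $c{n111};
$obs{n222} = $c{nppp} - $c{n1pp} - $c{np1p} - $c{npp1}
           + $c{n11p} + $c{n1p1} + $c{np11} - $c{n111};

# A negative cell means the supplied counts are inconsistent.
for my $cell (sort keys %obs) {
    die "inconsistent counts: $cell = $obs{$cell}\n" if $obs{$cell} < 0;
    print "$cell = $obs{$cell}\n";
}
```

The eight cells always sum to nppp, which gives a quick consistency check on the input counts.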

getStatisticName() - Returns the name of this statistic.


RETURN VALUES : $name .. Name of the measure.


Ted Pedersen, University of Minnesota Duluth

Satanjeev Banerjee, Carnegie Mellon University

Amruta Purandare, University of Pittsburgh

Bridget Thomson-McInnes, University of Minnesota Twin Cities

Saiyam Kohli, University of Minnesota Duluth


