Sam Henry
NAME This program calculates the association for a dataset of term pairs


This utility takes a file of line-separated term pairs as input. The file is of the form "cui1<>cui2\n", with each line containing a new cui pair. It outputs a line-separated list of association scores and term pairs of the form "score<>cui1<>cui2". Each line contains a different cui pair and its score.
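The input and output line formats can be sketched in Python as follows. This is only an illustration of the two formats, not the utility's own code: the function names `parse_pair_line` and `format_score_line`, the example identifiers, and the score value are all hypothetical.

```python
SEPARATOR = "<>"  # delimiter used in both the input and the output lines


def parse_pair_line(line):
    """Split one input line of the form 'cui1<>cui2' into a (cui1, cui2) tuple."""
    parts = line.strip().split(SEPARATOR)
    if len(parts) != 2 or not all(parts):
        raise ValueError(f"expected 'cui1<>cui2', got: {line!r}")
    return parts[0], parts[1]


def format_score_line(score, cui1, cui2):
    """Build one output line of the form 'score<>cui1<>cui2'."""
    return SEPARATOR.join([str(score), cui1, cui2])


if __name__ == "__main__":
    # Placeholder identifiers and score, chosen only for illustration.
    pair = parse_pair_line("C0000001<>C0000002\n")
    print(format_score_line(0.5, *pair))
```

Rejecting malformed lines early, rather than skipping them silently, keeps the number of output lines predictable relative to the input.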


Usage: [OPTIONS] CUI_LIST_FILE OUTPUT_FILE --measure Assoc_Measure --matrix Matrix_FileName



The input file, containing line-separated cui pairs of the form: "cui1<>cui2"


The output file, where each score and cui pair is written in the form: "score<>cui1<>cui2"


Name of the file containing co-occurrence data in sparse matrix format


A string specifying the association measure used to calculate the association. Recommended: x2

The package uses the Text::NSP package to do the calculation. The measures included within this package are:

  1. Dice Coefficient
  2. Fisher's exact test - left sided
  3. Fisher's exact test - right sided
  4. Fisher's exact test - two tailed
  5. Jaccard Coefficient
  6. Log-likelihood ratio
  7. Mutual Information
  8. Odds Ratio
  9. Pointwise Mutual Information
  10. Phi Coefficient
  11. Pearson's Chi Squared Test
  12. Poisson Stirling Measure
  13. T-score
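As a worked illustration of one measure in this list, Pearson's chi-squared statistic can be computed from a 2x2 contingency table of co-occurrence counts. The Python sketch below uses the textbook formula and is not Text::NSP's implementation. The function name `chi_squared` and the parameter names (`n11`, `n1p`, `np1`, `npp`) are assumptions made for this example.

```python
def chi_squared(n11, n1p, np1, npp):
    """Pearson's chi-squared statistic for a 2x2 contingency table.

    n11: number of times the two terms co-occur
    n1p: total number of co-occurrences involving the first term
    np1: total number of co-occurrences involving the second term
    npp: total number of co-occurrences in the data
    (Parameter names are assumptions for this sketch.)
    """
    # Remaining marginal totals.
    n2p = npp - n1p
    np2 = npp - np1

    # Remaining observed cells, derived from the marginals.
    n12 = n1p - n11
    n21 = np1 - n11
    n22 = npp - n1p - np1 + n11
    observed = [n11, n12, n21, n22]
    if min(observed) < 0:
        raise ValueError("counts are inconsistent with the marginal totals")

    # Expected cell counts under independence: row total * column total / N.
    expected = [
        n1p * np1 / npp,
        n1p * np2 / npp,
        n2p * np1 / npp,
        n2p * np2 / npp,
    ]
    if min(expected) == 0:
        raise ValueError("a marginal total is zero; the statistic is undefined")

    # Sum of (observed - expected)^2 / expected over the four cells.
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))


if __name__ == "__main__":
    # Made-up counts: the pair co-occurs 10 times, each term appears in 20
    # co-occurrences, and there are 100 co-occurrences overall.
    print(chi_squared(10, 20, 20, 100))
```

With these made-up counts the expected co-occurrence count under independence is 4, so an observed count of 10 gives a statistic well above zero. A pair whose observed counts match the expected counts exactly scores 0.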


Optional command line arguments. These options are identical to Please see for descriptions.


The association score for each concept pair in the input file is written to a new line of the output file.


  • Perl (version 5.8.5 or better) -

  • Text::NSP -


      Sam Henry: henryst at 


 Sam Henry, Virginia Commonwealth University
 Bridget T. McInnes, Virginia Commonwealth University 
 Alexander D. McQuilkin, Virginia Commonwealth University


