NAME

Set::Similarity::BV::Jaccard - Jaccard coefficient for sets

SYNOPSIS

 use Set::Similarity::BV::Jaccard;

 my $jaccard = Set::Similarity::BV::Jaccard->new;
 my $similarity = $jaccard->similarity('af09ff','9c09cc');

DESCRIPTION

Jaccard Index

The Jaccard coefficient measures the similarity between two sample sets. It is defined as the size of the intersection divided by the size of the union of the sample sets:

| A intersect B | / | A union B |

The Tanimoto coefficient is the ratio of the number of elements common to both sets to the total number of distinct elements in the two sets, i.e.

| A intersect B | / ( |A| + |B| - | A intersect B | ) # the same as Jaccard

The range is 0 to 1 inclusive.
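
As a rough illustration of the definition above, the following pure-Perl sketch computes the coefficient for two sets encoded as hex strings. It assumes that each string is read as a bit vector in which every set bit marks one element. The helper jaccard_hex is hypothetical and not part of Set::Similarity::BV; the module itself may interpret its arguments differently.

```perl
use strict;
use warnings;

# Hypothetical helper for illustration only, not part of the module.
# Assumes both arguments are hex strings of equal length.
sub jaccard_hex {
    my ($hex_a, $hex_b) = @_;

    # Convert the hex digits into packed byte strings.
    my $bits_a = pack 'H*', $hex_a;
    my $bits_b = pack 'H*', $hex_b;

    # On two strings, & and | work bytewise (the 'bitwise' feature
    # must not be enabled). unpack '%32b*' counts the set bits.
    my $intersection = unpack '%32b*', $bits_a & $bits_b;
    my $union        = unpack '%32b*', $bits_a | $bits_b;

    # Two empty sets are treated as identical.
    return 1 if $union == 0;

    return $intersection / $union;
}

# 9 bits are set in both vectors, 17 in at least one of them.
printf "%.4f\n", jaccard_hex('af09ff', '9c09cc');   # prints 0.5294
```

The result is 9/17 because the two vectors share 9 set bits, while 16 + 10 - 9 = 17 bits are set in at least one of them, which matches the Tanimoto form of the formula.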

METHODS

Set::Similarity::BV::Jaccard inherits all methods from Set::Similarity::BV and implements the following new ones.

from_integers

  my $similarity = $object->from_integers($AoI1,$AoI2);

This method expects two array references of integers as parameters. The parameters are not checked, which can lead to unexpected results or an uncaught division by zero.

If you want to use this method directly, you should catch the situation where one of the parameters is empty (similarity is 0), or both are empty (similarity is 1).
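
The guard described above could be sketched as a small wrapper. The function safe_from_integers is a hypothetical name, not part of the module, and the sketch assumes that "empty" means an undefined value or an array reference without elements.

```perl
use strict;
use warnings;

# Hypothetical wrapper for illustration only, not part of the module.
# $scorer is any object that provides from_integers().
sub safe_from_integers {
    my ($scorer, $aoi1, $aoi2) = @_;

    # Assumption: undef or an empty array reference counts as empty.
    my $empty1 = !@{ $aoi1 // [] };
    my $empty2 = !@{ $aoi2 // [] };

    return 1 if $empty1 && $empty2;   # both empty: identical
    return 0 if $empty1 || $empty2;   # exactly one empty: nothing shared

    return $scorer->from_integers($aoi1, $aoi2);
}

# Possible usage, assuming the module is installed:
#   my $jaccard    = Set::Similarity::BV::Jaccard->new;
#   my $similarity = safe_from_integers($jaccard, $AoI1, $AoI2);
```

Passing the object as the first argument keeps the wrapper independent of a particular similarity class, so the same guard would work for any class that provides from_integers.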

SOURCE REPOSITORY

http://github.com/wollmers/Set-Similarity-BV

AUTHOR

Helmut Wollmersdorfer, <helmut.wollmersdorfer@gmail.com>

COPYRIGHT AND LICENSE

Copyright (C) 2016 by Helmut Wollmersdorfer

This library is free software; you can redistribute it and/or modify it under the same terms as Perl itself.