NAME
Set::Similarity::BV - similarity measures for sets using fast bit vectors (BV)
![Set-Similarity-BV](https://travis-ci.org/wollmers/Set-Similarity-BV.png)
![Coverage Status](https://coveralls.io/repos/wollmers/Set-Similarity-BV/badge.png?branch=master)
![Kwalitee Score](http://cpants.cpanauthors.org/dist/Set-Similarity-BV.png)
SYNOPSIS
    use Set::Similarity::BV::Dice;

    # object method
    my $dice = Set::Similarity::BV::Dice->new;
    my $similarity = $dice->similarity('af09ff', '9c09cc');

    # class method
    my $dice = 'Set::Similarity::BV::Dice';
    my $similarity = $dice->similarity('af09ff', '9c09cc');
DESCRIPTION
This is the base class. It mainly provides helper and convenience methods.
To compute a similarity, use one of the child classes:
Overlap coefficient
( A intersect B ) / min(A,B)
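The formula above can be tried out on small sets encoded as integer bit masks. This is only a minimal sketch: the helper names `popcount` and `overlap` are made up for illustration and are not part of this distribution, and returning 0 for an empty set is an assumption.

```perl
use strict;
use warnings;
use List::Util qw(min);

# Count the set bits of a non-negative integer by shifting it right.
sub popcount {
    my ($n) = @_;
    my $count = 0;
    while ($n) {
        $count += $n & 1;
        $n >>= 1;
    }
    return $count;
}

# Overlap coefficient of two sets encoded as integer bit masks.
sub overlap {
    my ($set_a, $set_b) = @_;
    my $smaller = min(popcount($set_a), popcount($set_b));
    return 0 if $smaller == 0;    # assumption: an empty set yields 0
    return popcount($set_a & $set_b) / $smaller;
}

my $set_a = 0b1011;    # elements 0, 1 and 3
my $set_b = 0b0011;    # elements 0 and 1
printf "%.2f\n", overlap($set_a, $set_b);    # B is a subset of A: prints 1.00
```

Because the denominator is the size of the smaller set, the coefficient reaches 1 whenever one set is fully contained in the other.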
Jaccard Index
The Jaccard coefficient measures the similarity between sample sets. It is defined as the size of the intersection divided by the size of the union of the sample sets:
( A intersect B ) / (A union B)
The Tanimoto coefficient is the ratio of the number of features common to both sets to the total number of features, i.e.
( A intersect B ) / ( A + B - ( A intersect B ) ) # the same as Jaccard
The range is 0 to 1 inclusive.
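The Jaccard and Tanimoto forms agree because the size of the union equals the sum of both set sizes minus the size of the intersection. The following sketch illustrates this on plain Perl lists; the function name `jaccard` is hypothetical, and returning 0 for two empty sets is an assumption.

```perl
use strict;
use warnings;

# Jaccard index of two sets given as array references of distinct elements.
sub jaccard {
    my ($set_a, $set_b) = @_;
    my %in_a = map { $_ => 1 } @$set_a;
    my %in_b = map { $_ => 1 } @$set_b;

    # Size of the intersection: elements of B that also occur in A.
    my $common = grep { $in_a{$_} } keys %in_b;

    # Size of the union, computed the Tanimoto way: A + B - (A intersect B).
    my $union = keys(%in_a) + keys(%in_b) - $common;

    return 0 unless $union;    # assumption: two empty sets yield 0
    return $common / $union;
}

my $j = jaccard([qw(a b c d)], [qw(c d e)]);
printf "%.1f\n", $j;    # 2 common elements out of 5 distinct ones: prints 0.4
```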
Dice coefficient
The Dice coefficient is the number of features common to both sets relative to the average size of the two sets, i.e.
( A intersect B ) / ( 0.5 * ( A + B ) ) # the same as Sørensen
The factor 0.5 in the denominator turns the sum of the two set sizes into their average. The range is 0 to 1 inclusive.
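As a small worked example, the sketch below computes Dice and Jaccard from the same three counts and checks the well-known relation D = 2J / (1 + J). The function names are hypothetical and only serve the illustration; returning 0 for empty sets is an assumption.

```perl
use strict;
use warnings;

# Dice coefficient from counts: common / (0.5 * (size_a + size_b)).
sub dice_from_counts {
    my ($common, $size_a, $size_b) = @_;
    my $sum = $size_a + $size_b;
    return 0 unless $sum;    # assumption: two empty sets yield 0
    return 2 * $common / $sum;
}

# Jaccard index from the same counts, for comparison.
sub jaccard_from_counts {
    my ($common, $size_a, $size_b) = @_;
    my $union = $size_a + $size_b - $common;
    return 0 unless $union;
    return $common / $union;
}

my $d = dice_from_counts(3, 4, 6);       # 6 / 10
my $j = jaccard_from_counts(3, 4, 6);    # 3 / 7
printf "dice=%.3f jaccard=%.3f\n", $d, $j;

# Both measures rank pairs of sets identically, since D = 2J / (1 + J).
printf "dice from jaccard=%.3f\n", 2 * $j / (1 + $j);
```

For any pair of sets Dice is at least as large as Jaccard, and the two are equal only at 0 and 1.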
METHODS
All methods can be used as class or object methods.
new
    $object = Set::Similarity::BV->new();
similarity
    my $similarity = $object->similarity($hex1, $hex2);

$hex is a string of hexadecimal characters.
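To make the idea of comparing hex strings concrete, here is a standalone sketch that reads each hex digit as four bits of a bit vector and derives a Dice value from bit counts. It uses only core `pack` and `unpack`. That this matches the internals or the exact results of this module is an assumption, and the helper names are hypothetical.

```perl
use strict;
use warnings;

# Number of set bits in a hex string, read as a bit vector.
sub hex_popcount {
    my ($hex) = @_;
    return unpack('%32b*', pack('H*', $hex));
}

# Number of bits set in both hex strings (string-wise AND of the packed bytes).
sub hex_intersection {
    my ($hex1, $hex2) = @_;
    my $both = pack('H*', $hex1) & pack('H*', $hex2);
    return unpack('%32b*', $both);
}

my ($hex1, $hex2) = ('af09ff', '9c09cc');
my $common = hex_intersection($hex1, $hex2);               # 9 shared bits
my $sum    = hex_popcount($hex1) + hex_popcount($hex2);    # 16 + 10 bits
printf "%.4f\n", 2 * $common / $sum;    # 18 / 26, prints 0.6923
```

The `%32b*` template is the bit-counting idiom documented for `unpack`. The `&` operator works on the packed byte strings here because both operands are plain strings and the `bitwise` feature is not enabled.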
from_integers
    my $similarity = $object->from_integers($AoI1, $AoI2);
Croaks if called directly. This method should be implemented in a child module.
intersection
    my $intersection_size = $object->intersection($AoI1, $AoI2);

$AoI is an array reference of integers. Returns the length of the intersection.
combined_length
    my $set_size_sum = $object->combined_length($AoI1, $AoI2);

$AoI is an array reference of integers.
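One plausible reading of intersection and combined_length is sketched below. It assumes that each array holds the chunks of one bit vector as integers, that the intersection size is the number of bits set in both vectors, and that the combined length is the sum of the bits set in each. All function names are hypothetical and the module's actual code may differ.

```perl
use strict;
use warnings;

# Count the set bits of one integer chunk via its binary representation.
sub count_bits {
    my ($int) = @_;
    my $binary = sprintf('%b', $int);
    return ($binary =~ tr/1//);
}

# Intersection size: AND the chunks pairwise and add up the set bits.
sub intersection_size {
    my ($aoi1, $aoi2) = @_;
    my $last = @$aoi1 < @$aoi2 ? $#$aoi1 : $#$aoi2;
    my $size = 0;
    $size += count_bits($aoi1->[$_] & $aoi2->[$_]) for 0 .. $last;
    return $size;
}

# Combined size: the set bits of both vectors added together.
sub combined_size {
    my ($aoi1, $aoi2) = @_;
    my $size = 0;
    $size += count_bits($_) for @$aoi1, @$aoi2;
    return $size;
}

my $aoi1 = [0xaf, 0x09, 0xff];
my $aoi2 = [0x9c, 0x09, 0xcc];
printf "%d\n", intersection_size($aoi1, $aoi2);    # prints 9
printf "%d\n", combined_size($aoi1, $aoi2);        # prints 26
```

With these two numbers every measure from the DESCRIPTION section can be derived, for example Jaccard as 9 / (26 - 9).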
min
    my $min = $object->min($int1, $int2);
bits
    my $bits = $object->bits($int);

Returns the number of bits set in the integer.
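Counting set bits (a population count) can be done in several ways. The sketch below uses Kernighan's method, which clears the lowest set bit on every pass and therefore loops once per set bit. It is shown for illustration only; the name `count_set_bits` is hypothetical and this is not necessarily the technique the module uses.

```perl
use strict;
use warnings;

# Kernighan's method: $int & ($int - 1) clears the lowest set bit.
sub count_set_bits {
    my ($int) = @_;
    my $count = 0;
    while ($int) {
        $int &= $int - 1;
        $count++;
    }
    return $count;
}

printf "%d\n", count_set_bits(0xaf);    # 0xaf is 10101111 in binary: prints 6
```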
SEE ALSO
SOURCE REPOSITORY
http://github.com/wollmers/Set-Similarity-BV
AUTHOR
Helmut Wollmersdorfer, <helmut.wollmersdorfer@gmail.com>
![Kwalitee Score](http://cpants.cpanauthors.org/author/wollmers.png)
COPYRIGHT AND LICENSE
Copyright (C) 2016 by Helmut Wollmersdorfer
This library is free software; you can redistribute it and/or modify it under the same terms as Perl itself.