# NAME

Math::SZaru::UniqueEstimator - Statistical estimator for total number of unique items

# SYNOPSIS

```
use Math::SZaru::UniqueEstimator;
my $ue = Math::SZaru::UniqueEstimator->new($maxelems);
$ue->add_elem($string);  # insert one element; see METHODS
my $inserted_elems_count = $ue->tot_elems;
my $estimated_unique_count = $ue->estimate();
```

# DESCRIPTION

`Math::SZaru::UniqueEstimator` provides a statistical estimate of the number of unique elements in the input stream.

Quoting the documentation of the SZaru C++ implementation:

```
The technique used is:
- Convert all elements to unique evenly spaced hash keys.
- Keep track of the smallest N elements ("nelems") of these elements.
- "nelems" cannot grow beyond maxelems.
- Based on the coverage of the space, compute an estimate
  of the total number of unique elements, where biggest-small-elem
  means largest element among kept "maxelems" elements.

unique = nelems < maxelems
         ? nelems
         : (maxelems << bits-in-hash) / biggest-small-elem
```
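The quoted technique can be sketched in pure Perl. This is only an illustration of the idea, not the SZaru or `Math::SZaru` source: the helper names (`make_estimator`, `add_string`, `estimate_unique`), the choice of the first 32 bits of an MD5 digest as the hash key, and the linear scan for the largest kept key are all assumptions made for brevity.

```perl
use strict;
use warnings;
use Digest::MD5 qw(md5);
use List::Util qw(max);

# Assumption: 32-bit hash keys taken from the first 4 bytes of MD5.
my $HASH_BITS = 32;

# Hypothetical constructor: keeps at most $maxelems of the smallest keys.
sub make_estimator {
    my ($maxelems) = @_;
    return { maxelems => $maxelems, kept => {}, tot => 0 };
}

sub add_string {
    my ($est, $str) = @_;
    $est->{tot}++;                          # counts every insertion
    my $key  = unpack('N', md5($str));      # unsigned 32-bit hash key
    my $kept = $est->{kept};
    return if exists $kept->{$key};         # key already tracked
    $kept->{$key} = 1;
    # Never let the kept set grow beyond maxelems: drop the largest key.
    if (keys(%$kept) > $est->{maxelems}) {
        delete $kept->{ max(keys %$kept) };
    }
    return;
}

sub estimate_unique {
    my ($est) = @_;
    my $kept   = $est->{kept};
    my $nelems = keys %$kept;
    return $nelems if $nelems < $est->{maxelems};   # exact below the cap
    my $biggest = max(keys %$kept);
    return $nelems if $biggest == 0;                # guard against division by zero
    # Floating-point form of (maxelems << bits-in-hash) / biggest-small-elem.
    return int($est->{maxelems} * 2**$HASH_BITS / $biggest);
}

my $small = make_estimator(10);
add_string($small, $_) for qw(a b a c);
print estimate_unique($small), "\n";   # prints 3 (fewer unique keys than maxelems)
```

Below the cap the result is an exact count; once more than `maxelems` distinct keys have been seen, the result is a statistical estimate whose accuracy depends on `maxelems`.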

# METHODS

## new

Constructor. Expects an integer (`$maxelems` in the synopsis) giving the maximum number of elements kept in the underlying hash table.

## add_elem

Given a string, adds the string to the UniqueEstimator hash table.

## add_elems

Same as `add_elem`, but accepts an arbitrary number of strings to insert into the estimator at once.

## tot_elems

Returns the total number of elements that were added to the estimator. This is the count of insertions, not the unique-element estimate.

## estimate

Returns the estimated number of unique elements that were added to the estimator so far. See above for a description of the algorithm.
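As a worked example of the formula from the DESCRIPTION, the following sketch plugs in toy numbers. The helper `estimate_from_state` and the 8-bit hash space are hypothetical and chosen only to keep the arithmetic readable; they are not part of `Math::SZaru`.

```perl
use strict;
use warnings;

# Hypothetical helper mirroring the quoted formula.
sub estimate_from_state {
    my ($nelems, $maxelems, $bits_in_hash, $biggest_small_elem) = @_;
    return $nelems if $nelems < $maxelems;    # below the cap: exact count
    return ($maxelems << $bits_in_hash) / $biggest_small_elem;
}

# Toy 8-bit hash space (keys 0..255) with maxelems = 4.
# Kept smallest keys: 10, 25, 40, 64, so biggest-small-elem is 64.
print estimate_from_state(4, 4, 8, 64), "\n";   # prints 16

# Only 3 keys seen so far: the count itself is returned.
print estimate_from_state(3, 4, 8, 40), "\n";   # prints 3
```

In the first call, the four kept keys cover the lowest quarter of the hash space (64 of 256), so evenly spread keys suggest about 4 × 4 = 16 unique elements overall.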

# SEE ALSO

Math::SZaru

# AUTHOR

Steffen Mueller, <smueller@cpan.org>

http://www.apache.org/licenses/LICENSE-2.0