Counting loading
This algorithm approximates the number of unique items (cardinality) of a multiset. This is achieved by using a hash function that is applied to every element that is to be counted. The algorithm observes the maximum number of leading zeros that occur for all hash values, where intuitively hash values with more leading zeros are less likely and indicate a larger cardinality.
Inputs
Insert 1,000,000 random records with a given cardinality of...
Precision
Smaller will give a worse estimate
Result
Estimate () | {{Math.round(result.estimate, 0)}} * ({{(((( Math.round(result.estimate, 0) - result.actual_cardinality)/result.actual_cardinality) * 100.0)).toFixed(2)}}% error) | |
Registers () | {{result.m}} | The number of registers used. |
Memory used | {{result.m*8}} | bytes |
Calculation
The cardinality () is estimated by the formula
{{result.alpha_m}} | A constant used to correct a systematic multiplicative bias | |
{{Math.pow(result.m, 2.0)}} | The number of registers () squared | |
{{result.z}} | The harmonic mean of the registers |
Corrections
For certain cardinalities, the above calculation yields an incorrect result, so HyperLogLog applies a measure to correct this
Original estimate () | {{result.originalEstimate}} |
Correction Method | {{result.correction_used.name}} (load factor = {{result.correction_used.metadata.FillFactor}}) |
Linear Count
Linear Count is a simple counting algorithm that considers the load factor of the registers. A low load factor indicates the probability of collisions is low, and the cardinality can be estimated by counting the number of registers that are empty (), and applying the following formula
As you can see from the graph below, this gives a fairly good estimate of the cardinality when the registers are not full, but becomes more prone to error as approaches 0.
This is why HyperLogLog only uses LinearCount for smaller cardinalities.
Registers {{result.m}}
There are too many registers to display here, but you can imagine it! Try setting the precision to a lower value if you want to see the registers in action!
Each register represents the maximum number of leading zeroes + 1 seen.
{{value}} |