More

This website only touches upon the details of the original paper. Since it was released, some improvements have been made, such as HyperLogLog++, which:

  • Uses a 64-bit hash function rather than a 32-bit one
  • Introduces sparse representation for the registers to save memory (rather than having one huge array)
  • Introduces a further set of bias corrections to improve the count at lower cardinalities
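
To make the first two points more concrete, here is a minimal Python sketch of a register store that starts out sparse and switches to a dense array once enough registers are in use. It is a simplified illustration, not the encoding HyperLogLog++ actually uses (the real sparse format is more compact and works at a higher precision). The names SparseRegisters, hash64 and index_and_rank, the precision P = 14, the conversion threshold, and the use of SHA-1 as the 64-bit hash are all assumptions made for this example.

```python
import hashlib

P = 14        # precision (assumed for this sketch): 2**P registers
M = 1 << P    # number of registers


def hash64(item):
    """Return a 64-bit hash: the first 8 bytes of SHA-1 (illustrative choice)."""
    digest = hashlib.sha1(item.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big")


def index_and_rank(h):
    """Split a 64-bit hash into (register index, rank of the remaining bits)."""
    idx = h >> (64 - P)                       # top P bits pick the register
    rest = h & ((1 << (64 - P)) - 1)          # remaining 64 - P bits
    rank = (64 - P) - rest.bit_length() + 1   # position of the leftmost 1-bit
    return idx, rank


class SparseRegisters:
    """Keep only non-zero registers in a dict until too many are in use."""

    def __init__(self, threshold=M // 4):
        self.sparse = {}            # register index -> rank (non-zero only)
        self.dense = None           # list of M ranks once converted
        self.threshold = threshold  # hypothetical switch-over point

    def add(self, item):
        idx, rank = index_and_rank(hash64(item))
        if self.dense is not None:
            self.dense[idx] = max(self.dense[idx], rank)
            return
        if rank > self.sparse.get(idx, 0):
            self.sparse[idx] = rank
        if len(self.sparse) > self.threshold:
            self._to_dense()

    def _to_dense(self):
        """Copy the sparse entries into one array of M registers."""
        self.dense = [0] * M
        for idx, rank in self.sparse.items():
            self.dense[idx] = rank
        self.sparse = None
```

At low cardinalities only a handful of registers are ever non-zero, so the dict stays far smaller than an array of 16,384 entries. Once most registers are populated the dense array becomes the cheaper option, which is why the sketch converts one way only.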

Seen out in the wild

Command line tool

I wrote a tool called card that you can use to determine the approximate cardinality of an input (stdin or file). It makes use of the HyperLogLog++ library written by Clark Duvall.

Background reading

Previous research

HyperLogLog stands on the shoulders of: