Using HyperLogLog (HLL) hashes for count approximations

HyperLogLog is an algorithm for estimating cardinality in extremely large datasets using little memory and time. This simple but powerful algorithm answers one question: how many unique values (the cardinality) does a very large dataset contain? The question is known as the count-distinct problem in computer science and as the cardinality estimation problem in applied mathematics. We call it the Cardinality Estimation Problem in this article because it sounds more impressive.

Background info: see "HyperLogLog: A Simple but Powerful Algorithm for Data Scientists" and the Wikipedia article on HyperLogLog.

We use the PostgreSQL HLL extension to run approximate COUNT DISTINCT queries, for example to estimate the number of beneficiaries who have ever lived in a certain zip code or in a group of zip codes.
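
The zip code example could look roughly like the sketch below. It assumes the open-source postgresql-hll extension is installed; the table and column names (enrollments, bene_id, zip, zip_bene_hll, bene_set) and the zip code values are hypothetical and stand in for whatever the real schema uses.

```sql
-- Assumption: the postgresql-hll extension is available on the server.
CREATE EXTENSION IF NOT EXISTS hll;

-- Hypothetical source table: one row per beneficiary per zip code of residence.
CREATE TABLE IF NOT EXISTS enrollments (
    bene_id text NOT NULL,
    zip     text NOT NULL
);

-- Build one HLL sketch per zip code from hashed beneficiary IDs.
CREATE TABLE zip_bene_hll AS
SELECT zip,
       hll_add_agg(hll_hash_text(bene_id)) AS bene_set
FROM enrollments
GROUP BY zip;

-- Approximate number of distinct beneficiaries in a single zip code.
SELECT hll_cardinality(bene_set) AS approx_beneficiaries
FROM zip_bene_hll
WHERE zip = '02138';

-- Approximate number of distinct beneficiaries across a group of zip codes.
-- Unioning the sketches counts a beneficiary who lived in several of
-- these zip codes only once.
SELECT hll_cardinality(hll_union_agg(bene_set)) AS approx_beneficiaries
FROM zip_bene_hll
WHERE zip IN ('02138', '02139', '02140');
```

Storing one sketch per zip code means the group query only has to merge a handful of small HLL values instead of rescanning the raw rows. The results are estimates, so they should not be used where exact counts are required.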