
Answer-first summary for fast verification
Answer: HyperLogLog (HLL)
The question asks which function can estimate the approximate number of distinct values in a table with trillions of rows. HyperLogLog (HLL) is designed for exactly this: it is a probabilistic algorithm that estimates cardinality in a small, fixed amount of memory, so it stays efficient even on massive datasets. The community discussion shows 100% consensus on option D, citing Snowflake's official documentation, which states that HLL returns an approximation of COUNT(DISTINCT ...) and is suitable for estimating the distinct cardinality of inputs with trillions of rows with a small error. The other options are incorrect: MD5 is a hash function that computes a digest of a value, window functions perform calculations over a set of rows related to the current row, and External refers to external tables. None of these estimates the number of distinct values.
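To make the idea behind the estimate concrete, here is a minimal Python sketch of the HyperLogLog technique. It is an illustration only and not Snowflake's implementation: the class name `HyperLogLog`, the precision `p=12`, and the use of SHA-1 as the hash are all assumptions chosen for the example. Each value is hashed, the first `p` bits pick a register, and the register keeps the longest run of leading zeros seen in the remaining bits; the estimate is derived from the harmonic mean of the registers.

```python
import hashlib
import math


class HyperLogLog:
    """Illustrative HyperLogLog sketch (hypothetical class, not Snowflake's code)."""

    def __init__(self, p=12):
        self.p = p                      # number of bits used to choose a register
        self.m = 1 << p                 # number of registers (4096 when p=12)
        self.registers = [0] * self.m
        # Bias-correction constant from the HyperLogLog paper, valid for m >= 128.
        self.alpha = 0.7213 / (1 + 1.079 / self.m)

    def add(self, value):
        # Take a 64-bit hash of the value (SHA-1 is an arbitrary choice here).
        digest = hashlib.sha1(str(value).encode("utf-8")).digest()
        h = int.from_bytes(digest[:8], "big")
        idx = h >> (64 - self.p)                    # first p bits: register index
        rest = h & ((1 << (64 - self.p)) - 1)       # remaining 64 - p bits
        # Rank = position of the leftmost 1 bit in the remaining bits.
        rank = (64 - self.p) - rest.bit_length() + 1
        if rank > self.registers[idx]:
            self.registers[idx] = rank

    def estimate(self):
        # Raw estimate from the harmonic mean of 2^register.
        z = sum(2.0 ** -r for r in self.registers)
        raw = self.alpha * self.m * self.m / z
        # Small-range correction (linear counting) while many registers are empty.
        zeros = self.registers.count(0)
        if raw <= 2.5 * self.m and zeros:
            return self.m * math.log(self.m / zeros)
        return raw


if __name__ == "__main__":
    hll = HyperLogLog(p=12)
    for i in range(100_000):
        hll.add(f"user-{i}")
        hll.add(f"user-{i}")            # duplicates do not change the sketch
    print(round(hll.estimate()))
```

With `p=12` the sketch holds 4096 small registers regardless of how many rows are added, and the theoretical standard error is about 1.04 / sqrt(4096), or roughly 1.6%. That fixed memory footprint is what makes this approach practical where an exact COUNT(DISTINCT ...) would be too expensive.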
Author: LeetQuiz Editorial Team