
Answer-first summary for fast verification
Answer: Delta Lake automatically collects statistics on the first 32 columns of each table which are leveraged in data skipping based on query filters.
Let's analyze each option:

- **A**: Incorrect. Parquet is a columnar storage format, not row-based, so compression operates on the values within each column rather than row by row.
- **B**: Correct. Delta Lake collects statistics on the first 32 columns by default, and these statistics are used for data skipping during query execution.
- **C**: Incorrect. Standard views in the Lakehouse are logical and do not cache data. Materialized views may cache results, but the statement does not specify them.
- **D**: Incorrect. Primary and foreign key constraints can be declared, but they are informational only and are not enforced, so duplicate values can still be inserted.
- **E**: Incorrect. Z-order can be applied to columns of other data types (e.g., strings, dates), not just numeric values.

Only **B** is correct, even though the question format suggests two answers.
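To make the data-skipping idea in option B concrete, here is a minimal sketch in plain Python. It is not Delta Lake's implementation: the `FileStats` class, the `collect_stats` and `files_to_scan` helpers, and the file names are all hypothetical, and the sketch only models per-file min/max statistics for the leading columns. In Delta Lake itself, the number of indexed columns is controlled by the `delta.dataSkippingNumIndexedCols` table property.

```python
from dataclasses import dataclass

# Assumed default, mirroring the "first 32 columns" rule described above.
NUM_INDEXED_COLS = 32


@dataclass
class FileStats:
    """Hypothetical per-file statistics: min and max for each indexed column."""
    path: str
    min_values: dict
    max_values: dict


def collect_stats(path, rows, column_order, num_indexed_cols=NUM_INDEXED_COLS):
    """Record min/max only for the first `num_indexed_cols` columns."""
    indexed = column_order[:num_indexed_cols]
    mins = {c: min(r[c] for r in rows) for c in indexed}
    maxs = {c: max(r[c] for r in rows) for c in indexed}
    return FileStats(path, mins, maxs)


def files_to_scan(stats, column, value):
    """Return the files that may contain rows where `column == value`."""
    selected = []
    for s in stats:
        if column not in s.min_values:
            # No statistics for this column, so the file cannot be skipped.
            selected.append(s.path)
        elif s.min_values[column] <= value <= s.max_values[column]:
            # The value falls inside the file's [min, max] range.
            selected.append(s.path)
    return selected


columns = ["event_date", "country", "payload"]
file_a = [
    {"event_date": "2024-01-01", "country": "DE", "payload": "x"},
    {"event_date": "2024-01-31", "country": "FR", "payload": "y"},
]
file_b = [
    {"event_date": "2024-02-01", "country": "DE", "payload": "z"},
    {"event_date": "2024-02-29", "country": "US", "payload": "w"},
]

# Index only the first two columns to show what happens past the limit.
stats = [
    collect_stats("part-0.parquet", file_a, columns, num_indexed_cols=2),
    collect_stats("part-1.parquet", file_b, columns, num_indexed_cols=2),
]

print(files_to_scan(stats, "event_date", "2024-02-10"))  # ['part-1.parquet']
print(files_to_scan(stats, "payload", "z"))  # ['part-0.parquet', 'part-1.parquet']
```

The filter on `event_date` skips the first file because the value lies outside its recorded range, while the filter on `payload` must read both files because that column sits past the indexed limit. This is why columns that are frequently used in filters are usually placed among the leading columns of a table.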
Author: LeetQuiz Editorial Team
Which of the following statements accurately describes the relationship between Delta Lake and the Lakehouse architecture?
A. Because Parquet compresses data row by row, strings will only be compressed when a character is repeated multiple times.
B. Delta Lake automatically collects statistics on the first 32 columns of each table which are leveraged in data skipping based on query filters.
C. Views in the Lakehouse maintain a valid cache of the most recent versions of source tables at all times.
D. Primary and foreign key constraints can be leveraged to ensure duplicate values are never entered into a dimension table.
E. Z-order can only be applied to numeric values stored in Delta Lake tables.