
Answer-first summary for fast verification
Answer: D — If most partitions in the table have less than 1 GB of data
Over-partitioning or poorly chosen partition columns can significantly degrade performance. Because files cannot be combined or compacted across partition boundaries, a table with many small partitions incurs higher storage overhead and forces queries to scan more files, slowing them down. A table is likely over-partitioned if most of its partitions contain less than 1 GB of data. Reference: [Databricks Documentation on Partitions](https://docs.databricks.com/tables/partitions.html)
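The 1 GB rule of thumb can be checked directly against a table's storage layout. Below is a minimal sketch, assuming a Hive-style partition layout (`table_path/col=value/part-*.parquet`); the helper names, the directory structure, and the use of plain filesystem calls (rather than the Delta transaction log or Spark APIs) are illustrative assumptions, not the official procedure.

```python
# Sketch: flag a likely over-partitioned table by measuring on-disk
# bytes per partition directory. Assumes a local, Hive-style layout;
# cloud object stores would need their own listing API instead of os.
import os

ONE_GB = 1 << 30  # 1 GiB threshold from the rule of thumb above


def partition_sizes(table_path):
    """Return {partition_dir_name: total_bytes} for directories that
    look like Hive-style partitions (names containing '=')."""
    sizes = {}
    for entry in os.scandir(table_path):
        if entry.is_dir() and "=" in entry.name:
            total = 0
            for root, _dirs, files in os.walk(entry.path):
                total += sum(
                    os.path.getsize(os.path.join(root, f)) for f in files
                )
            sizes[entry.name] = total
    return sizes


def looks_over_partitioned(sizes, threshold=ONE_GB):
    """True if most partitions hold less data than the threshold."""
    if not sizes:
        return False
    small = sum(1 for s in sizes.values() if s < threshold)
    return small > len(sizes) / 2
```

In practice the same per-partition size check could be run with a Spark aggregation over the table's file listing; the point is only that "most partitions < 1 GB" is a measurable condition, not a guess.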
Author: LeetQuiz Editorial Team
How can a data engineering team determine if their Delta Lake tables in the Lakehouse are over-partitioned?
A
If the partitioning columns are fields of low cardinality
B
If most partitions in the table have more than 1 GB of data
C
If the number of partitions in the table are too low
D
If most partitions in the table have less than 1 GB of data
E
If the data in the table continues to arrive indefinitely