
Answer-first summary for fast verification
Answer: Move columns 58 and 59 into the first 32 columns to utilize the default statistics collection for the initial 32 columns.
The best approach is to place columns 58 and 59 within the first 32 columns of the table schema. By default, Delta Lake collects file-level statistics (the minimum, maximum, and null count used for data skipping) only for the first 32 columns, so columns at positions 58 and 59 get none. Moving them forward enables data skipping on the selective filters without adding write overhead as new records arrive. Raising 'delta.dataSkippingNumIndexedCols' to 59 (option A) would also cover them, but statistics would then be collected for 27 additional columns on every write, which increases overhead. Option B is incorrect because statistics are not collected for all columns by default. Option D is incorrect because setting the property to 2 collects statistics only for the first two columns in the schema, not for columns 58 and 59. Option C rests on a wrong premise: the default limit is 32 columns, not 16, so there is no need to squeeze the columns into the first 16 positions.
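The reasoning above can be sketched in plain Python. This is a minimal illustration, not a Delta Lake API: the helper names (`columns_with_stats`, `move_to_front`, `create_table_sql`), the 60-column flat schema, the `col_N` column names, and the `STRING` type are all assumptions made for the example. Only the property name 'delta.dataSkippingNumIndexedCols' and its default of 32 come from the explanation.

```python
# Default value of delta.dataSkippingNumIndexedCols, per the explanation above.
DEFAULT_NUM_INDEXED_COLS = 32


def columns_with_stats(columns, num_indexed=DEFAULT_NUM_INDEXED_COLS):
    """Return the leading columns that would get file-level statistics.

    Assumes a flat schema (no nested fields).
    """
    return list(columns[:num_indexed])


def move_to_front(columns, hot_columns):
    """Return a new ordering with hot_columns first, the rest in original order."""
    hot = [c for c in hot_columns if c in columns]
    rest = [c for c in columns if c not in hot]
    return hot + rest


def create_table_sql(table, columns, col_type="STRING"):
    """Render a CREATE TABLE statement; col_type is a placeholder assumption."""
    cols = ", ".join(f"{c} {col_type}" for c in columns)
    return f"CREATE TABLE {table} ({cols}) USING DELTA"


# Hypothetical 60-column schema for the 'voters' table.
columns = [f"col_{i}" for i in range(1, 61)]
hot = ["col_58", "col_59"]

# Original order: the filter columns fall outside the first 32 columns.
covered_before = set(hot) <= set(columns_with_stats(columns))

# Option E: reorder at table creation so both columns are within the first 32.
reordered = move_to_front(columns, hot)
covered_after = set(hot) <= set(columns_with_stats(reordered))

# Option D: a limit of 2 covers only the first two columns of the schema.
option_d = columns_with_stats(columns, num_indexed=2)

# Option A: a limit of 59 covers the filter columns but indexes 59 columns.
option_a = columns_with_stats(columns, num_indexed=59)

ddl = create_table_sql("voters", reordered)
print(covered_before, covered_after, option_d, len(option_a))
```

Moving the two columns to the very front is only one valid placement; any position within the first 32 columns satisfies the default limit.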
Author: LeetQuiz Editorial Team
A Delta table named 'voters' has columns 58 and 59 that are frequently used in highly selective filters. What should a data engineer do at table creation to improve query performance on these filters while minimizing overhead as new records are added?
A. Increase the value of 'delta.dataSkippingNumIndexedCols' to 59 to enable statistics collection for the first 59 columns.
B. No action is required since statistics are automatically collected for all columns in a Delta table by default.
C. Reposition columns 58 and 59 within the first 16 columns to leverage automatic statistics collection for these columns.
D. Adjust the 'delta.dataSkippingNumIndexedCols' setting to 2 to focus statistics collection on the two columns in question.
E. Move columns 58 and 59 into the first 32 columns to utilize the default statistics collection for the initial 32 columns.