
Answer-first summary for fast verification
Answer: Use only user_id as a clustering key.
If two columns are highly correlated, you only need to include one of them as a clustering key. Reference: https://docs.databricks.com/aws/en/delta/clustering#choose-clustering-keys
Author: LeetQuiz Editorial Team
Ultimate access to all questions.
Your table is often filtered by user_id and session_id, which are highly correlated. What is the best practice according to the doc?
A
Use both user_id and session_id as clustering keys.
B
Use only user_id as a clustering key.
C
Use only session_id as a clustering key.
D
Use neither.
No comments yet.