
Databricks Certified Data Engineer - Professional
Your table is often filtered by user_id and session_id, which are highly correlated. What is the best practice according to the Databricks documentation?
Explanation:
If two columns are highly correlated, you only need to include one of them as a clustering key. Here, that means clustering by either user_id or session_id, not both. Reference: https://docs.databricks.com/aws/en/delta/clustering#choose-clustering-keys
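
To make the rule concrete, here is a minimal Python sketch that drops redundant candidates and builds the matching ALTER TABLE ... CLUSTER BY statement. The table name events, both helper functions, and the choice to keep the first listed column of a correlated group are assumptions for illustration only; the documentation does not say which of the two columns to keep.

```python
def choose_clustering_keys(candidates, correlated_groups):
    """Keep only the first candidate from each group of highly correlated columns.

    candidates: column names in priority order.
    correlated_groups: sets of columns assumed to be highly correlated.
    """
    keys = []
    seen_groups = set()
    for col in candidates:
        # Find the index of the correlated group this column belongs to, if any.
        group = next(
            (i for i, g in enumerate(correlated_groups) if col in g), None
        )
        if group is None:
            keys.append(col)  # Not correlated with anything: keep it.
        elif group not in seen_groups:
            seen_groups.add(group)  # First column seen from this group: keep it.
            keys.append(col)
        # Otherwise a correlated column was already kept, so skip this one.
    return keys


def cluster_by_statement(table, keys):
    """Build the SQL statement that sets the clustering keys on a table."""
    return f"ALTER TABLE {table} CLUSTER BY ({', '.join(keys)})"


# Hypothetical table name; user_id and session_id come from the question.
keys = choose_clustering_keys(
    ["user_id", "session_id"],
    [{"user_id", "session_id"}],
)
statement = cluster_by_statement("events", keys)
print(statement)  # ALTER TABLE events CLUSTER BY (user_id)
```

On Databricks, the resulting statement could be run with spark.sql(statement) or directly in a SQL cell.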