Databricks Certified Data Engineer - Professional


Your table is often filtered by user_id and session_id, which are highly correlated. According to the Databricks documentation on choosing clustering keys, what is the best practice?




Explanation:

If two columns are highly correlated, you only need to include one of them as a clustering key. Here, that means clustering on either user_id or session_id, not both.

Reference: https://docs.databricks.com/aws/en/delta/clustering#choose-clustering-keys
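
As a rough illustration of the point above, the following sketch builds the `ALTER TABLE ... CLUSTER BY` statement for a single clustering key. The table name `events`, the helper `cluster_by_statement`, and the choice of user_id over session_id are assumptions made for the example; they do not come from the question or the documentation.

```python
def cluster_by_statement(table: str, keys: list[str]) -> str:
    """Build an ALTER TABLE ... CLUSTER BY statement (hypothetical helper)."""
    if not keys:
        raise ValueError("at least one clustering key is required")
    return f"ALTER TABLE {table} CLUSTER BY ({', '.join(keys)})"


# user_id and session_id are highly correlated, so only one is used as a key.
# Picking user_id is an assumption; session_id alone would follow the same rule.
stmt = cluster_by_statement("events", ["user_id"])
print(stmt)

# In a Databricks notebook, where `spark` is predefined, the statement
# could then be executed with: spark.sql(stmt)
```

Which of the two columns to keep is a judgment call that depends on the actual query patterns; the guidance only says that listing both adds little when they are highly correlated.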