Databricks Certified Data Engineer - Professional

You are optimizing a Spark job that performs multiple aggregations across different columns of a large dataset. Which of the following strategies is the BEST approach for efficiency, scalability, and minimal data shuffling? Choose one option.