
Databricks Certified Data Engineer - Professional
In the context of optimizing Spark jobs for large datasets, you are tasked with performing multiple aggregations across different columns. Considering the need for efficiency, scalability, and minimal data shuffling, which of the following strategies would you choose as the BEST approach? Choose one option.