
As a Microsoft Fabric Analytics Engineer Associate, you are optimizing a complex SQL query in a Spark notebook within Azure Databricks. The query involves multiple window functions and is experiencing performance issues. Considering the need for cost efficiency, compliance with data governance policies, and scalability, which of the following approaches would BEST improve the query's performance? (Choose one option.)
A. Rewrite the query to use subqueries and temporary tables, ensuring that the temporary tables are created with appropriate partitioning to leverage parallel processing.
B. Use the 'cache' command to store the tables involved in the query in memory, and apply dynamic filtering to reduce the dataset size before processing.
C. Add more indexes to the tables involved in the query, focusing on columns used in the window functions to speed up data access.
D. Use the 'repartition' command to redistribute the data evenly across the cluster before applying the window functions, ensuring optimal resource utilization and minimizing data shuffling.
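
To make the mechanics behind the 'repartition' option concrete, here is a minimal plain-Python sketch, not Spark code. It simulates hash-repartitioning rows by the window key and then computing a ROW_NUMBER-style window function inside each partition. The sample rows, the `customer` and `amount` column names, and both helper functions are hypothetical and exist only for illustration. In PySpark the corresponding calls would be roughly `df.repartition("customer")` followed by `Window.partitionBy("customer").orderBy("amount")`.

```python
from collections import defaultdict
from zlib import crc32


def repartition_by_key(rows, key, num_partitions):
    """Hash-partition rows so all rows sharing `key` land in the same partition."""
    partitions = [[] for _ in range(num_partitions)]
    for row in rows:
        # crc32 is stable across runs, unlike the built-in hash() for strings.
        idx = crc32(str(row[key]).encode("utf-8")) % num_partitions
        partitions[idx].append(row)
    return partitions


def row_number_over(partition, key, order_by):
    """Simulate ROW_NUMBER() OVER (PARTITION BY key ORDER BY order_by) on one partition."""
    groups = defaultdict(list)
    for row in partition:
        groups[row[key]].append(row)
    numbered = []
    for group in groups.values():
        ordered = sorted(group, key=lambda r: r[order_by])
        for n, row in enumerate(ordered, start=1):
            numbered.append({**row, "row_number": n})
    return numbered


# Hypothetical sample data (assumption: not taken from the question).
rows = [
    {"customer": "a", "amount": 30},
    {"customer": "b", "amount": 10},
    {"customer": "a", "amount": 20},
    {"customer": "c", "amount": 50},
    {"customer": "b", "amount": 40},
]

partitions = repartition_by_key(rows, "customer", num_partitions=3)

# Each partition is processed independently; no rows cross partitions here.
result = [r for part in partitions for r in row_number_over(part, "customer", "amount")]
```

Because every row for a given key sits in one partition, the window function needs no further data movement in this simulation. Whether an explicit repartition actually helps in a real Spark job depends on the physical plan and on how skewed the key is, so treat this as an illustration of the idea rather than a tuning recommendation.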