Microsoft Fabric Analytics Engineer Associate DP-600

As a Microsoft Fabric Analytics Engineer Associate, you are tasked with optimizing the performance of a complex SQL query in a Spark notebook within Azure Databricks. The query joins multiple large tables, and the solution must be cost-effective, scalable, and compliant with data governance policies. Given these constraints, which of the following approaches would most improve query performance? (Choose one option)

Rewrite the query to utilize subqueries and temporary tables, ensuring that the temporary tables are persisted in a cost-effective storage layer.

15.4%

Implement the 'cache' command to store the tables involved in the query in memory, taking into account the memory constraints and the size of the tables.

Apply the 'broadcast' join optimization to minimize data shuffle during the join operation, especially when one of the tables is small enough to be broadcast.

61.5%
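
To make the broadcast option concrete, here is a minimal pure-Python sketch of the idea behind a broadcast hash join. It is not Spark code: the function name broadcast_hash_join, the list-of-lists "partitions", and the sample orders and countries data are all hypothetical, chosen only for illustration. In PySpark the same intent is normally expressed with the pyspark.sql.functions.broadcast hint, or with a /*+ BROADCAST(t) */ hint in Spark SQL.

```python
from collections import defaultdict


def broadcast_hash_join(large_partitions, small_rows, key):
    """Inner-join every partition of a large table against a full copy of a small table.

    Illustrative only: `large_partitions` is a list of lists standing in for
    the partitions of a distributed table (an assumption of this sketch).
    """
    # Build the lookup once. In Spark, this corresponds to the copy of the
    # small table that is shipped to every executor.
    lookup = defaultdict(list)
    for row in small_rows:
        lookup[row[key]].append(row)

    joined_partitions = []
    for partition in large_partitions:
        # Each partition is joined locally, so no rows of the large table
        # have to move between partitions (no shuffle of the large side).
        joined = [
            {**big, **small}
            for big in partition
            for small in lookup.get(big[key], [])
        ]
        joined_partitions.append(joined)
    return joined_partitions


# Hypothetical sample data: a "large" table split into two partitions
# and a small dimension table.
orders = [
    [{"order_id": 1, "country_id": 10}, {"order_id": 2, "country_id": 20}],
    [{"order_id": 3, "country_id": 10}, {"order_id": 4, "country_id": 30}],
]
countries = [
    {"country_id": 10, "name": "Norway"},
    {"country_id": 20, "name": "Chile"},
]

result = broadcast_hash_join(orders, countries, "country_id")
for index, part in enumerate(result):
    print(index, part)
```

The large table keeps its original partitioning, and only the small table is copied. That is why the technique pays off only when one side is small enough to fit comfortably in each executor's memory.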