
Databricks Certified Data Engineer - Professional
You are processing a large dataset with Spark Structured Streaming and notice that query performance is lower than expected because of over-partitioning. The dataset is expected to grow significantly over time, and the solution must be cost-effective, scalable, and compliant with data governance policies. Which of the following strategies would BEST optimize query performance by addressing the issues caused by over-partitioning while also accounting for the dataset's future growth? Choose one option.