Databricks Certified Data Engineer - Professional


You are a data engineer working on a project that involves processing a large dataset in Azure Databricks using Spark. The dataset contains sales data from the past five years, with each record including a timestamp, product ID, and sales amount. Your task is to optimize the performance of a batch processing job that aggregates sales by product ID. The dataset is currently skewed, with a few product IDs accounting for a significant portion of the data. Considering the need for efficient resource utilization and minimizing job execution time, which partitioning strategy would you choose to optimize the performance of your job? Choose the best option from the following:
