
You are developing a batch processing solution using Azure Data Lake Storage Gen2 and Azure Databricks. The solution must handle a dataset larger than 50 TB. Which approach ensures efficient data processing and storage management, considering data partitioning, storage optimization, and job scheduling?
A
Use Azure Data Lake Storage Gen2 for raw data storage and Azure Databricks for processing. Partition data by date and use Azure Data Factory for job scheduling.
B
Use Azure Blob Storage for raw data storage and Azure Databricks for processing. Leave the data unpartitioned and use cron jobs for scheduling.
C
Use Azure Data Lake Storage Gen2 for raw data storage and Azure Databricks for processing. Partition data by date and use Azure Databricks for job scheduling.
D
Use Azure Blob Storage for raw data storage and Azure Databricks for processing. Partition data by date and use Azure Data Factory for job scheduling.
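The date-based partitioning mentioned in options A, C, and D typically means laying data out in a Hive-style folder hierarchy (`year=/month=/day=`) inside the lake, so that queries filtering on date can prune whole directories. A minimal sketch of how such partition paths are constructed — the storage account name `mylake`, container `raw`, and dataset name `events` are hypothetical, and in a real Databricks job the layout would usually be produced by Spark's `partitionBy` rather than built by hand:

```python
from datetime import date

def partition_path(base: str, d: date) -> str:
    """Build a Hive-style date partition path, as commonly used in ADLS Gen2 layouts."""
    return f"{base}/year={d.year}/month={d.month:02d}/day={d.day:02d}"

# Hypothetical ADLS Gen2 URI: abfss://<container>@<account>.dfs.core.windows.net/<path>
base = "abfss://raw@mylake.dfs.core.windows.net/events"
print(partition_path(base, date(2024, 5, 1)))
# → abfss://raw@mylake.dfs.core.windows.net/events/year=2024/month=05/day=01
```

In PySpark the equivalent write would be along the lines of `df.write.partitionBy("year", "month", "day").parquet(base)`, which creates the same directory structure automatically.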