
Answer-first summary for fast verification
Answer: Create a new Delta Lake table that pre-aggregates the sales data on a daily basis and query this table for the weekly report.
Creating a new Delta Lake table that pre-aggregates the sales data daily mirrors the concept of a materialized view, significantly reducing the time needed to generate the weekly report by avoiding full dataset processing each time. Scheduled refreshes (e.g., daily) via Databricks Jobs ensure the report includes up-to-date information while optimizing query performance, aligning with Databricks data engineering best practices by leveraging Delta Lake's capabilities for efficient large dataset management and processing.
Author: LeetQuiz Editorial Team
Ultimate access to all questions.
A retail company's data engineering team utilizes Databricks to analyze sales data, facing delays due to the complexity and time consumption of generating a weekly sales performance report across various regions from millions of transactions in Delta Lake. To mirror the concept of a materialized view within Databricks for efficiency, which strategy should they adopt?
A
Increase the compute resources of the Databricks cluster before running the weekly report to reduce execution time.
B
Directly query the transactional data each time the report is needed, ensuring the most up-to-date information is used.
C
Create a new Delta Lake table that pre-aggregates the sales data on a daily basis and query this table for the weekly report.
D
Store the query results in a blob storage and update it manually every month to ensure data freshness.
No comments yet.