
The data engineering team maintains a table named store_sales_summary, updated nightly by a batch job, containing aggregate statistics: the previous day's total sales along with totals and averages for various time periods (7-day, quarter-to-date, year-to-date). The schema is:
store_id INT,
total_sales_qtd FLOAT,
avg_daily_sales_qtd FLOAT,
total_sales_ytd FLOAT,
avg_daily_sales_ytd FLOAT,
previous_day_sales FLOAT,
total_sales_7d FLOAT,
avg_daily_sales_7d FLOAT,
updated TIMESTAMP
The source table daily_store_sales (schema: store_id INT, sales_date DATE, total_sales FLOAT) is implemented as a Type 1 table, meaning total_sales values may be updated in place after manual auditing. What is the safest approach to ensure accurate reporting in store_sales_summary?
A
Implement the appropriate aggregate logic as a batch read against the daily_store_sales table and overwrite the store_sales_summary table with each update.
B
Implement the appropriate aggregate logic as a batch read against the daily_store_sales table and append new rows nightly to the store_sales_summary table.
C
Implement the appropriate aggregate logic as a batch read against the daily_store_sales table and use upsert logic to update results in the store_sales_summary table.
D
Implement the appropriate aggregate logic as a Structured Streaming read against the daily_store_sales table and use upsert logic to update results in the store_sales_summary table.
E
Use Structured Streaming to subscribe to the change data feed for daily_store_sales and apply changes to the aggregates in the store_sales_summary table with each update.
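For intuition, the full-rebuild-and-overwrite pattern described in option A can be sketched in plain Python. The in-memory rows and the build_summary helper below are illustrative stand-ins, not part of the question; the point is that recomputing every aggregate from the source on each run means in-place (Type 1) corrections to historical rows are always reflected, with no stale appended or merged rows left behind:

```python
from datetime import date

# Illustrative stand-in for daily_store_sales: (store_id, sales_date, total_sales).
daily_store_sales = [
    (1, date(2024, 3, 1), 100.0),
    (1, date(2024, 3, 2), 200.0),
    (1, date(2024, 3, 3), 300.0),
]

def build_summary(rows):
    """Recompute summary aggregates from the full source table.

    Overwriting store_sales_summary with this result each night picks up
    any in-place corrections made to historical source rows.
    """
    per_store = {}
    for store_id, _, total_sales in rows:
        acc = per_store.setdefault(store_id, {"total": 0.0, "days": 0})
        acc["total"] += total_sales
        acc["days"] += 1
    return {
        sid: {
            "total_sales": acc["total"],
            "avg_daily_sales": acc["total"] / acc["days"],
        }
        for sid, acc in per_store.items()
    }

# Nightly run: rebuild from source, then overwrite the summary table.
print(build_summary(daily_store_sales))  # store 1: total 600.0, avg 200.0

# Later, an audit corrects a historical row in place (Type 1 update)...
daily_store_sales[0] = (1, date(2024, 3, 1), 150.0)

# ...and the next full rebuild reflects the correction automatically.
print(build_summary(daily_store_sales))  # store 1: total 650.0
```

By contrast, append-only or upsert approaches that only process new dates would silently miss the corrected historical row, and Structured Streaming cannot read a source whose existing rows are updated in place unless it consumes a change data feed.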