
A data engineering team is automating a Spark SQL query that compiles monthly sales data from tables named in the format monthly_sales_YYYYMM, where YYYYMM is the year and month. The query must automatically target the previous month's data. For example, if the current month is March 2023, the query should read from monthly_sales_202302. The standard query is: SELECT product_category, SUM(sales) FROM monthly_sales_YYYYMM GROUP BY product_category; What strategy should the team use to ensure the query dynamically adjusts to the previous month?
A
Propose altering the query's execution frequency to quarterly, thereby reducing the need for monthly table name updates.
B
Manually update the table name in the query to reflect the previous month's data before executing it each time.
C
Develop a PySpark script that calculates the date for the previous month and dynamically inserts this date into the table name within the query string.
D
Revise the database design to consolidate sales data into a single table with a month column, avoiding the necessity for separate monthly tables.
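
The approach described in option C can be sketched roughly as follows. This is a minimal illustration, not a definitive implementation: the helper names previous_month_suffix and build_sales_query are hypothetical, and the final spark.sql call assumes an active SparkSession named spark, which is why it is left as a comment. Only the table name pattern and the query text come from the question.

```python
from datetime import date, timedelta
from typing import Optional


def previous_month_suffix(today: Optional[date] = None) -> str:
    """Return the previous month as YYYYMM, relative to `today` (hypothetical helper)."""
    today = today or date.today()
    # Stepping back one day from the first of the current month
    # always lands in the previous month, including across a year boundary.
    last_day_of_previous_month = today.replace(day=1) - timedelta(days=1)
    return last_day_of_previous_month.strftime("%Y%m")


def build_sales_query(today: Optional[date] = None) -> str:
    """Build the query string with the previous month's table name (hypothetical helper)."""
    table_name = f"monthly_sales_{previous_month_suffix(today)}"
    return (
        "SELECT product_category, SUM(sales) "
        f"FROM {table_name} "
        "GROUP BY product_category"
    )


# Example from the question: running in March 2023 targets monthly_sales_202302.
query = build_sales_query(date(2023, 3, 15))
print(query)

# Assumption: with an active SparkSession named `spark`, the query could be run as:
# result = spark.sql(query)
```

Because the table suffix is derived from a date rather than from user input, interpolating it into the query string is safe here; a January run resolves to December of the prior year without special handling.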