
A data analyst has a series of queries in a SQL program. The data analyst wants this program to run every day. They only want the final query in the program to run on Sundays. They ask for help from the data engineering team to complete this task. Which of the following approaches could be used by the data engineering team to complete this task?
A
They could submit a feature request with Databricks to add this functionality.
B
They could wrap the queries using PySpark and use Python's control flow system to determine when to run the final query.
C
They could only run the entire program on Sundays.
D
They could automatically restrict access to the source table in the final query so that it is only accessible on Sundays.
E
They could redesign the data model to separate the data used in the final query into a new table.
Explanation:
Why B is correct:
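Wrapping the queries in PySpark with spark.sql() lets the program use Python if statements to execute the final query only when datetime.today().weekday() == 6 (Sunday), while the other queries still run every day.
Why other options are incorrect:
A: No feature request is needed; the behavior can be built with tools that already exist.
C: Running the entire program only on Sundays would stop the other queries from running daily.
D: Restricting access to the source table would make the final query fail on other days rather than skip it, and access control is not a scheduling mechanism.
E: Redesigning the data model does not control when a query runs.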
Implementation example:
from datetime import datetime
from pyspark.sql import SparkSession

spark = SparkSession.builder.getOrCreate()

# Run the daily queries (spark.sql returns a DataFrame; add an action such as .show() to display results)
spark.sql("SELECT * FROM daily_table")
spark.sql("SELECT * FROM intermediate_table")

# Run the final query only on Sundays (weekday() returns 0 for Monday through 6 for Sunday)
if datetime.today().weekday() == 6:
    spark.sql("SELECT * FROM final_table")
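The same control flow can be made easier to test by passing the run date in as a parameter instead of calling datetime.today() inline. The following is a minimal sketch under stated assumptions, not a definitive implementation: is_sunday, run_program, and the run_sql callable are hypothetical names introduced here for illustration. In a notebook, run_sql would likely be spark.sql or a thin wrapper around it.

```python
from datetime import date
from typing import Callable, List, Optional, Sequence

SUNDAY = 6  # date.weekday() numbers Monday as 0 and Sunday as 6


def is_sunday(run_date: date) -> bool:
    """Return True when run_date falls on a Sunday."""
    return run_date.weekday() == SUNDAY


def run_program(
    run_sql: Callable[[str], object],  # hypothetical hook, e.g. spark.sql
    daily_queries: Sequence[str],
    final_query: str,
    run_date: Optional[date] = None,
) -> List[str]:
    """Run every daily query, then the final query only on Sundays.

    Returns the list of queries that were actually submitted.
    """
    if run_date is None:
        # Uses the local date of the machine running the code.
        run_date = date.today()

    executed = []
    for query in daily_queries:
        run_sql(query)
        executed.append(query)

    if is_sunday(run_date):
        run_sql(final_query)
        executed.append(final_query)

    return executed
```

Assuming an active SparkSession named spark, a call might look like run_program(spark.sql, ["SELECT * FROM daily_table", "SELECT * FROM intermediate_table"], "SELECT * FROM final_table"). Because date.today() reflects the local date of the machine running the code, the Sunday boundary may not match the analyst's time zone; passing run_date explicitly avoids that ambiguity.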