
In a Databricks environment, you are working with a large dataset that includes a 'transaction_date' column formatted as 'yyyy-MM-dd'. Your task is to analyze transactions by year, month, and day. To achieve this, you need to cast the 'transaction_date' column to a timestamp and then extract the year, month, and day from the result. Considering both accuracy and performance when processing large datasets, which of the following Spark SQL queries correctly accomplishes this task? Choose the best of the four options.
A
SELECT CAST(transaction_date AS TIMESTAMP), EXTRACT(YEAR FROM transaction_date), EXTRACT(MONTH FROM transaction_date), EXTRACT(DAY FROM transaction_date) FROM dataset
B
SELECT transaction_date, YEAR(transaction_date), MONTH(transaction_date), DAY(transaction_date) FROM dataset
C
SELECT CAST(transaction_date AS TIMESTAMP) as ts, EXTRACT(YEAR FROM ts) as year, EXTRACT(MONTH FROM ts) as month, EXTRACT(DAY FROM ts) as day FROM dataset
D
SELECT FROM_UNIXTIME(CAST(transaction_date AS TIMESTAMP), 'yyyy-MM-dd') as formatted_date, EXTRACT(YEAR FROM formatted_date) as year FROM dataset
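The extraction semantics the question tests can be illustrated outside Spark with Python's standard `datetime` module. This is a minimal sketch, not Spark code: the sample date value is invented for illustration, and `strptime` stands in for what `CAST(transaction_date AS TIMESTAMP)` does for an ISO-formatted 'yyyy-MM-dd' string.

```python
from datetime import datetime

# A 'yyyy-MM-dd' string, as stored in the transaction_date column
# (sample value, assumed for illustration).
raw = "2023-07-15"

# Parsing to a datetime mirrors CAST(transaction_date AS TIMESTAMP).
ts = datetime.strptime(raw, "%Y-%m-%d")

# EXTRACT(YEAR/MONTH/DAY FROM ts) corresponds to these attributes.
year, month, day = ts.year, ts.month, ts.day
print(year, month, day)  # → 2023 7 15
```

Whatever option you pick, the key point is the order of operations: convert the string to a timestamp first, then extract the date parts from that converted value.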