Databricks Certified Data Engineer - Associate

In a data engineering project, you are working with a large dataset stored in a Delta table on Databricks. The dataset includes a 'timestamp' column formatted as 'yyyy-MM-dd HH:mm:ss'. Your task is to analyze the data by extracting the year, month, and day from the 'timestamp' column to facilitate time-based analysis. Considering the need for efficiency and correctness in Spark SQL, which of the following queries would you use to create a new table with these extracted values? Choose the best option.

SELECT EXTRACT(YEAR FROM timestamp) as year, EXTRACT(MONTH FROM timestamp) as month, EXTRACT(DAY FROM timestamp) as day FROM dataset

SELECT year(timestamp) as year, month(timestamp) as month, day(timestamp) as day FROM dataset
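
Neither option creates the table the question asks for, so here is a minimal sketch of how either SELECT could be wrapped to do that. It assumes the source table is named dataset as in the options. The target name dataset_by_date is hypothetical and not part of the original question.

```sql
-- Sketch only: `dataset_by_date` is a hypothetical target table name.
-- Backticks around `timestamp` keep the column name distinct from the TIMESTAMP type keyword.
CREATE OR REPLACE TABLE dataset_by_date AS
SELECT
  `timestamp`,
  year(`timestamp`)  AS year,   -- same value as EXTRACT(YEAR FROM `timestamp`)
  month(`timestamp`) AS month,  -- same value as EXTRACT(MONTH FROM `timestamp`)
  day(`timestamp`)   AS day     -- same value as EXTRACT(DAY FROM `timestamp`)
FROM dataset;
```

Both forms in the options return the same integer values, so the choice between them is mostly one of readability. If the column is stored as a string rather than a TIMESTAMP, wrapping it in to_timestamp(`timestamp`, 'yyyy-MM-dd HH:mm:ss') would make the conversion explicit instead of relying on an implicit cast.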
