Databricks Certified Associate Developer for Apache Spark


Which of the following code blocks returns a DataFrame with a column named dayOfYear that contains the integer day-of-year value derived from the openDate column in DataFrame storesDF?

Note: The openDate column is of integer type and stores UNIX epoch timestamps (seconds since midnight on January 1, 1970).

A sample of storesDF is shown below:

storeId  openDate
0        1100746394
1        1474410343
2        1116610009
3        1180035265
4        1408024997





Explanation:

The correct answer is A. The openDate column holds UNIX epoch time as integer seconds, so it must first be converted to TimestampType with .cast("timestamp"). The dayofyear function then returns the day of the year for that timestamp. Option D is incorrect because it casts to DateType, and a direct cast from an integer does not interpret the value as epoch seconds, so it does not yield the intended dates. Options B, C, and E use incorrect syntax, types, or logic for extracting the day of the year from a UNIX timestamp.
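
As a rough illustration of the cast-then-dayofyear approach described above, here is a minimal Python sketch. The answer options themselves are not reproduced on this page, so the PySpark expression in with_day_of_year is an assumption about what such a solution looks like, not a copy of option A. The helper names expected_day_of_year and with_day_of_year are hypothetical. The pure-Python helper cross-checks the sample values and assumes timestamps are interpreted in UTC.

```python
from datetime import datetime, timezone


def expected_day_of_year(epoch_seconds: int) -> int:
    """Return the day of year (1-366) for a UNIX timestamp, interpreted in UTC."""
    moment = datetime.fromtimestamp(epoch_seconds, tz=timezone.utc)
    return moment.timetuple().tm_yday


def with_day_of_year(stores_df):
    """Add a dayOfYear column to a DataFrame that has an integer openDate column.

    Hypothetical helper: pyspark is imported lazily so the pure-Python
    check above can run on a machine without Spark installed.
    """
    from pyspark.sql.functions import col, dayofyear

    # Integer seconds -> TimestampType, then extract the day of the year.
    return stores_df.withColumn(
        "dayOfYear", dayofyear(col("openDate").cast("timestamp"))
    )


# Sample openDate values from the question.
sample_open_dates = [1100746394, 1474410343, 1116610009, 1180035265, 1408024997]

for ts in sample_open_dates:
    print(ts, expected_day_of_year(ts))
```

Spark converts the timestamp to a date using the session time zone (spark.sql.session.timeZone), so the Spark result should match the UTC cross-check only when the session time zone is UTC. Values close to midnight, such as 1474410343, can land on a different day in other zones.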