
Answer-first summary for fast verification
Answer: D
(storesDF.withColumn("openTimestamp", col("openDate").cast("Timestamp"))
    .withColumn("month", month(col("openTimestamp"))))
The task is to extract the month from a UNIX timestamp (seconds since epoch) stored as an integer. The integer must be converted to a timestamp type before the month can be extracted.

- **Option A** uses `getMonth`, which is not a valid Spark function.
- **Option B** uses `substr` on the integer, which treats the value as a string and returns two of its digits instead of converting it to a date.
- **Option C** casts the integer to `Date`. Spark does not read the value as seconds since epoch in a cast to `Date`; depending on the Spark version the cast is rejected at analysis time or produces null, so no correct month results.
- **Option D** correctly casts the integer to `Timestamp`, which interprets it as seconds since epoch, and then uses `month()` to extract the month.
- **Option E** applies `month()` directly to the integer, which is invalid because `month()` requires a date or timestamp.

Only **Option D** handles both the conversion and the extraction correctly.
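As a sanity check on the reasoning above, the expected months for the sample `openDate` values can be computed without a Spark session. This is a minimal sketch using only the Python standard library, not Spark itself. The helper names `month_from_epoch_seconds` and `substr_like_option_b` are hypothetical, and the month is evaluated in UTC as an assumption (Spark's `month()` uses the session time zone).

```python
from datetime import datetime, timezone

# openDate values copied from the storesDF sample in the question.
SAMPLE_OPEN_DATES = [1100746394, 1474410343, 1116610009, 1180035265, 1408024997]


def month_from_epoch_seconds(epoch_seconds: int) -> int:
    """Return the calendar month (1-12) of a UNIX timestamp, evaluated in UTC."""
    return datetime.fromtimestamp(epoch_seconds, tz=timezone.utc).month


def substr_like_option_b(epoch_seconds: int) -> str:
    """Mimic substr(col, 4, 2): two characters starting at 1-based position 4."""
    return str(epoch_seconds)[3:5]


# Print each value with its real month and with what digit slicing would give.
for value in SAMPLE_OPEN_DATES:
    print(value, month_from_epoch_seconds(value), substr_like_option_b(value))
```

For the five sample rows this gives months 11, 9, 5, 5 and 8, which is what the `month` column from Option D should contain. The digit slice for the first row is "07", which has nothing to do with its actual month of 11. None of the sample timestamps falls near a month boundary, so the choice of time zone does not change these months.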
Author: LeetQuiz Editorial Team
Which of the following code blocks correctly returns a DataFrame with a month column containing the integer month value extracted from the openDate column in storesDF?
Note: The openDate column is of integer type and stores UNIX epoch timestamps (seconds since midnight January 1, 1970).
A sample of storesDF is shown below:
storeId openDate
0 1100746394
1 1474410343
2 1116610009
3 1180035265
4 1408024997
A
storesDF.withColumn("month", getMonth(col("openDate")))
B
storesDF.withColumn("month", substr(col("openDate"), 4, 2))
C
(storesDF.withColumn("openDateFormat", col("openDate").cast("Date"))
    .withColumn("month", month(col("openDateFormat"))))
D
(storesDF.withColumn("openTimestamp", col("openDate").cast("Timestamp"))
    .withColumn("month", month(col("openTimestamp"))))
E
storesDF.withColumn("month", month(col("openDate")))