You are given a Spark DataFrame 'df' with a date column 'date' in the format 'yyyy-MM-dd'. Write a code snippet that extracts the year, month, and day from the 'date' column and creates new columns for each component, and explain the steps involved.
A
from pyspark.sql.functions import year, month, day
df = df.withColumn('year', year('date')).withColumn('month', month('date')).withColumn('day', day('date'))
B
df = df.withColumn('year', df.date.substr(0, 4)).withColumn('month', df.date.substr(5, 2)).withColumn('day', df.date.substr(8, 2))
C
df = df.withColumn('year', df.date.year()).withColumn('month', df.date.month()).withColumn('day', df.date.day())
D
df = df.selectExpr('date', 'year(date) as year', 'month(date) as month', 'day(date) as day')
print(D)