You are given a Spark DataFrame 'df' with a string column 'text'. Write a code snippet that tokenizes the 'text' column into individual words using the 'split' function, and explain the steps involved.
A
from pyspark.sql.functions import split
result = df.select(split('text', ' '))
result.show()
B
from pyspark.sql.functions import explode, split
result = df.select(explode(split('text', ' ')))
result.show()
C
from pyspark.sql.functions import split
result = df.withColumn('words', split(df.text, ' '))
result.show()
D
from pyspark.sql.functions import explode, split
result = df.withColumn('words', explode(split(df.text, ' ')))
result.show()