
You are given a Spark DataFrame 'df' with a string column 'text'. Write a code snippet that tokenizes the 'text' column into individual words using the 'split' function, and explain the steps involved.
A
from pyspark.sql.functions import split
result = df.select(split('text', ' '))

B
from pyspark.sql.functions import split, explode
result = df.select(explode(split('text', ' ')))

C
from pyspark.sql.functions import split
result = df.withColumn('words', split(df.text, ' '))

D
from pyspark.sql.functions import split, explode
result = df.withColumn('words', explode(split(df.text, ' ')))