
Answer-first summary for fast verification
Answer: df.withColumn('extracted_user_id', get_json_object(col('data'), '$.user_id'))
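The correct answer is C. `get_json_object` takes a JSON string column and a JSONPath expression, so `'$.user_id'` pulls the `user_id` field out of each string in the `data` column, and `withColumn` adds the result as a new column `extracted_user_id` while keeping the existing columns. A and B misuse `from_json`, whose second argument is a schema rather than a field name. D performs the same extraction, but `select` returns only the extracted column instead of adding it to `df`.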
Author: LeetQuiz Editorial Team
Consider a DataFrame `df` with a column `data` containing JSON strings. How would you extract the field `user_id` from these JSON strings into a new column named `extracted_user_id` using Spark? Provide the code snippet.
A
df.withColumn('extracted_user_id', from_json(col('data'), 'user_id'))
B
df.select(from_json(col('data'), 'user_id').alias('extracted_user_id'))
C
df.withColumn('extracted_user_id', get_json_object(col('data'), '$.user_id'))
D
df.select(get_json_object(col('data'), '$.user_id').alias('extracted_user_id'))