
Explanation:
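The correct answer is C. get_json_object takes a JSON string column and a JSONPath expression, so get_json_object(col('data'), '$.user_id') returns the value of the user_id field as a string (or null when the field is missing or the JSON is invalid). withColumn then adds that value as a new column named extracted_user_id while keeping the existing columns of df. Option D performs the same extraction, but select returns a DataFrame containing only extracted_user_id, so the original columns are dropped. Options A and B are incorrect because the second argument of from_json is a schema (a StructType or a DDL string such as 'user_id STRING'), not a field name, and the function returns a struct rather than a single field.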
Consider a DataFrame df with a column data containing JSON strings. How would you extract the field user_id from these JSON strings into a new column named extracted_user_id using Spark? Provide the code snippet.
A
df.withColumn('extracted_user_id', from_json(col('data'), 'user_id'))
B
df.select(from_json(col('data'), 'user_id').alias('extracted_user_id'))
C
df.withColumn('extracted_user_id', get_json_object(col('data'), '$.user_id'))
D
df.select(get_json_object(col('data'), '$.user_id').alias('extracted_user_id'))