
Answer-first summary for fast verification
Answer: df.filter(col('email').contains('@example.com'))
The correct answer is A: it keeps only the rows whose `email` column contains '@example.com'. If the resulting DataFrame is empty, the column passes validation; any rows it returns are the violations.
Author: LeetQuiz Editorial Team
You are working with a DataFrame df that contains a column email which should not contain any instances of '@example.com'. How would you validate this using Spark? Provide the code snippet.
A
df.filter(col('email').contains('@example.com'))
B
df.filter(~col('email').contains('@example.com'))
C
df.select('email').distinct().filter(col('email').contains('@example.com'))
D
df.select('email').filter(~col('email').contains('@example.com'))