You have a DataFrame df with columns order_id, customer_id, and product_id. How would you validate that order_id is unique across all rows using Spark? Choose the correct code snippet.
A
df.groupBy('order_id').count().filter('count > 1')
B
df.select('order_id').distinct()
C
df.groupBy('order_id').agg(count('order_id') > 1)
D
df.groupBy('order_id').count().filter('count = 1')