
You have a DataFrame df with columns order_id, customer_id, and product_id. How would you validate that order_id is unique across all rows using Spark? Provide the code snippet.
A
df.groupBy('order_id').count().filter('count > 1')
B
df.select('order_id').distinct()
C
df.groupBy('order_id').agg(count('order_id') > 1)
D
df.groupBy('order_id').count().filter('count = 1')