Identify the error in the following code block intended to return the exact number of distinct values in the division column of DataFrame storesDF:
Code block:
storesDF.agg(approx_count_distinct(col("division")).alias("divisionDistinct"))
A
The approx_count_distinct() operation needs a second argument to set the rsd parameter to ensure it returns the exact number of distinct values.
B
There is no alias() operation for the approx_count_distinct() operation's output.
C
There is no way to return an exact distinct number in Spark because the data is distributed across partitions.
D
The approx_count_distinct() operation is not a standalone function; it should be used as a method from a Column object.
E
The approx_count_distinct() operation cannot determine an exact number of distinct values in a column.