
Answer-first summary for fast verification
Answer: storesDF.groupBy("division", "storeCategory").count()
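The answer follows from the signatures of `groupBy` in Spark's Scala API, which is overloaded to accept either column names as varargs strings (`groupBy(col1: String, cols: String*)`) or varargs `Column` objects (`groupBy(cols: Column*)`). Option C passes the two column names as separate string arguments, which matches the string overload, and `count()` then returns one row per unique combination of values. Option A passes a single `Seq` of `Column` objects, which does not type-check because the sequence is not expanded with `: _*`. Option B uses `division` and `storeCategory` as bare identifiers, which are undefined variables. Option D fails because `groupBy` returns a `RelationalGroupedDataset`, which has no `groupBy` method, so the calls cannot be chained. Option E passes a `Seq` of strings, which matches neither overload.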
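The behavior described above can be checked with a minimal sketch, written for `spark-shell` or a Scala script with Spark on the classpath. The sample rows and the names `counts`, `groupCols`, and `countsFromSeq` are hypothetical and chosen only for illustration; the second call shows one way a `Seq` of `Column` objects could be made to compile, by expanding it with `: _*`.

```scala
import org.apache.spark.sql.SparkSession
import org.apache.spark.sql.functions.col

// Local session for experimentation; in spark-shell this reuses the existing one.
val spark = SparkSession.builder()
  .master("local[*]")
  .appName("groupByCount")
  .getOrCreate()
import spark.implicits._

// Hypothetical sample data standing in for storesDF.
val storesDF = Seq(
  ("North", "Grocery"),
  ("North", "Grocery"),
  ("North", "Apparel"),
  ("South", "Grocery")
).toDF("division", "storeCategory")

// Option C: column names passed as separate string arguments.
val counts = storesDF.groupBy("division", "storeCategory").count()
counts.show()

// A Seq[Column] only matches groupBy(cols: Column*) when expanded with `: _*`.
val groupCols = Seq(col("division"), col("storeCategory"))
val countsFromSeq = storesDF.groupBy(groupCols: _*).count()
```

Both results contain the two grouping columns plus a `count` column of type `Long`, with one row per distinct (`division`, `storeCategory`) pair.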
Author: LeetQuiz Editorial Team
Which of the following code blocks correctly counts the number of rows in DataFrame storesDF for each unique combination of values in the columns division and storeCategory?
A
storesDF.groupBy(Seq(col("division"), col("storeCategory"))).count()
B
storesDF.groupBy(division, storeCategory).count()
C
storesDF.groupBy("division", "storeCategory").count()
D
storesDF.groupBy("division").groupBy("StoreCategory").count()
E
storesDF.groupBy(Seq("division", "storeCategory")).count()