
Answer-first summary for fast verification
Answer: pandas_df = spark_df.pandas_api() grouped = pandas_df.groupby('category')
To convert a PySpark DataFrame to a pandas-on-Spark DataFrame, use the 'pandas_api()' method (available since Spark 3.2 as part of the pandas API on Spark). The correct snippet is 'pandas_df = spark_df.pandas_api()'. After the conversion, you can group on the 'category' column with the pandas-style 'groupby()' method, as shown in option C. By contrast, 'toPandas()' (used in options A, B, and D) collects all data to the driver as a plain pandas DataFrame, so the result is no longer distributed.
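Because the pandas API on Spark mirrors plain pandas, the groupby step behaves like its pandas counterpart. A minimal sketch in plain pandas (toy data with hypothetical 'category' and 'amount' values, standing in for the contents of spark_df) illustrates what the grouped aggregation produces:

```python
import pandas as pd

# Toy data standing in for spark_df's contents (hypothetical values).
df = pd.DataFrame({
    "category": ["a", "b", "a", "b", "a"],
    "amount": [10, 20, 30, 40, 50],
})

# Same call shape as pandas-on-Spark: group on 'category', then aggregate.
grouped = df.groupby("category")
totals = grouped["amount"].sum()
print(totals.to_dict())  # {'a': 90, 'b': 60}
```

On a real cluster the same two lines (`grouped = pandas_df.groupby('category')`, then an aggregation) run against the pandas-on-Spark DataFrame returned by `spark_df.pandas_api()`, keeping execution distributed.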
Author: LeetQuiz Editorial Team
Given a PySpark DataFrame named 'spark_df', write a code snippet that converts it to a Pandas on Spark DataFrame and then performs a groupby operation on a column named 'category'.
A
pandas_df = spark_df.toPandas() grouped = pandas_df.groupby('category')
B
pandas_df = spark_df.toPandas().groupby('category')
C
pandas_df = spark_df.pandas_api() grouped = pandas_df.groupby('category')
D
pandas_df = spark_df.select('category').toPandas() grouped = pandas_df.groupby('category')