
Explanation:
A potential drawback of using the pandas API on Spark instead of PySpark is increased computation time due to internal frame conversion. The pandas API on Spark runs on top of Spark DataFrames and keeps pandas-style metadata, such as the index and column labels, in an internal frame. Converting between a native Spark DataFrame and a pandas-on-Spark DataFrame can therefore add overhead that equivalent PySpark code does not incur.
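The cost described above can be pictured with a toy model. The sketch below is plain Python, not Spark code, and it does not reproduce how Spark is implemented. The partition layout and the helpers attach_sequential_index and drop_index are hypothetical names introduced here. It assumes only that a Spark DataFrame has no built-in row index while a pandas-style frame needs one, so moving to the pandas-style form means extra work to number the rows.

```python
from itertools import accumulate

# Toy stand-in for a distributed table: a list of partitions, each a list of rows.
# Like a Spark DataFrame, it carries no global row index.
partitions = [
    [{"value": "a"}, {"value": "b"}],
    [{"value": "c"}],
    [{"value": "d"}, {"value": "e"}, {"value": "f"}],
]


def attach_sequential_index(parts):
    """Give every row a pandas-style position 0..n-1 (hypothetical helper)."""
    # Step 1: count rows per partition. On a cluster this is extra work
    # that a plain, index-free query would not need.
    sizes = [len(part) for part in parts]
    # Step 2: turn the sizes into the starting position of each partition.
    offsets = [0] + list(accumulate(sizes))[:-1]
    # Step 3: number the rows inside each partition from its offset.
    return [
        [(offset + i, row) for i, row in enumerate(part)]
        for offset, part in zip(offsets, parts)
    ]


def drop_index(indexed_parts):
    """Go back to the index-free form; the positions are simply discarded."""
    return [[row for _, row in part] for part in indexed_parts]


indexed = attach_sequential_index(partitions)
positions = [pos for part in indexed for pos, _ in part]
print(positions)                          # [0, 1, 2, 3, 4, 5]
print(drop_index(indexed) == partitions)  # True
```

In real code the corresponding conversions are, to the best of my knowledge, DataFrame.pandas_api() on a Spark DataFrame and to_spark() on a pandas-on-Spark DataFrame. Because the index is dropped on the way back and has to be rebuilt on the next conversion, a common suggestion is to avoid switching between the two APIs more often than needed.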
What could be a potential drawback of opting for the Pandas API on Spark over PySpark?
A. Limited functionality compared to PySpark
B. Inefficient data structure
C. Increased computation time due to internal frame conversion
D. Limited support for distributed computing
E. None of the above