
Answer-first summary for fast verification
Answer: When it’s faster to read all the computed data in DataFrame df that cannot fit into memory from disk rather than recompute it based on its logical plan.
The MEMORY_AND_DISK storage level is most advantageous when the partitions of DataFrame df that do not fit in memory can be read back from disk faster than they can be recomputed from the logical plan. With MEMORY_ONLY, partitions that do not fit are not cached and are recomputed each time they are needed; with MEMORY_AND_DISK, those partitions are written to disk and read from there instead. Option A describes a case where MEMORY_ONLY is sufficient, because nothing needs to spill. Options B and C describe cases where recomputation is faster than a disk read, so spilling to disk gains nothing. Option E is incorrect because MEMORY_ONLY is not always more advantageous: it offers no speedup for partitions that cannot fit in memory.
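The comparison behind this answer can be sketched as a small cost model. This is an illustrative sketch in plain Python, not Spark code: `PartitionCost`, `preferred_storage_level`, and all timings are hypothetical names and numbers invented for the example. It only considers partitions that did not fit in memory, since both storage levels serve the rest from memory.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class PartitionCost:
    """Hypothetical per-partition timings in seconds (illustrative values only)."""
    recompute_s: float  # time to rebuild the partition from the logical plan
    disk_read_s: float  # time to read the spilled partition back from disk


def preferred_storage_level(spilled: List[PartitionCost]) -> str:
    """Pick the cheaper level for the partitions that did not fit in memory."""
    recompute_total = sum(p.recompute_s for p in spilled)
    disk_total = sum(p.disk_read_s for p in spilled)
    # MEMORY_ONLY recomputes partitions that were not cached;
    # MEMORY_AND_DISK reads them back from disk instead.
    return "MEMORY_AND_DISK" if disk_total < recompute_total else "MEMORY_ONLY"


# Assumed scenario 1: an expensive plan (for example, wide joins) is slow to recompute.
expensive_plan = [PartitionCost(recompute_s=12.0, disk_read_s=1.5) for _ in range(4)]
# Assumed scenario 2: a cheap plan (for example, a simple filter) recomputes faster than disk I/O.
cheap_plan = [PartitionCost(recompute_s=0.2, disk_read_s=1.5) for _ in range(4)]

print(preferred_storage_level(expensive_plan))
print(preferred_storage_level(cheap_plan))
```

In PySpark, the choice modeled here corresponds to calling `df.persist(StorageLevel.MEMORY_AND_DISK)` or `df.persist(StorageLevel.MEMORY_ONLY)`, with `StorageLevel` imported from `pyspark`. Real costs depend on the cluster, the serialization format, and the plan, so measuring is more reliable than estimating.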
Author: LeetQuiz Editorial Team
When is it most beneficial to persist a DataFrame df using the MEMORY_AND_DISK storage level instead of MEMORY_ONLY?
A
When all of the computed data in DataFrame df can fit into memory.
B
When the memory is full and it’s faster to recompute all the data in DataFrame df rather than read it from disk.
C
When it’s faster to recompute all the data in DataFrame df that cannot fit into memory based on its logical plan rather than read it from disk.
D
When it’s faster to read all the computed data in DataFrame df that cannot fit into memory from disk rather than recompute it based on its logical plan.
E
The storage level MEMORY_ONLY will always be more advantageous because it’s faster to read data from memory than it is to read data from disk.