
Answer-first summary for fast verification
Answer: `persist()` allows specifying storage levels, unlike `cache()`.
The key difference between `cache()` and `persist()` in Spark is that `persist()` accepts an optional storage level, whereas `cache()` takes no arguments and always uses the default level. For DataFrames, that default is `MEMORY_AND_DISK` for both methods, not `MEMORY_ONLY` (`MEMORY_ONLY` is the default only for RDDs). Data stored by either method can be released with `unpersist()`, and neither method is designed specifically for short-term or long-term persistence.
Author: LeetQuiz Editorial Team
What distinguishes cache() from persist() in Spark?
A. persist() allows specifying storage levels, unlike cache().
B. cache() saves DataFrames in memory only, while persist() also uses disk.
C. The default storage level for both cache() and persist() is MEMORY_ONLY.
D. DataFrames cached with cache() cannot be unpersisted, but those with persist() can.
E. Use cache() for short-term and persist() for long-term DataFrame persistence.