
Explanation:
The key difference between cache() and persist() in Spark is that persist() accepts a storage level argument, whereas cache() takes no arguments and always uses the default level. For DataFrames, that default is MEMORY_AND_DISK for both methods, not MEMORY_ONLY (MEMORY_ONLY is the default for RDDs). Data stored with either method can be released with unpersist(), and neither method is intended specifically for short-term or long-term persistence. Option A is therefore the correct answer.
What distinguishes cache() from persist() in Spark?
A
persist() allows specifying storage levels, unlike cache().
B
cache() saves DataFrames in memory only, while persist() also uses disk.
C
The default storage level for both cache() and persist() is MEMORY_ONLY.
D
DataFrames cached with cache() cannot be unpersisted, but those with persist() can.
E
Use cache() for short-term and persist() for long-term DataFrame persistence.