
Answer-first summary for fast verification
Answer: 1. storesDF 2. persist 3. StorageLevel.MEMORY_ONLY
The goal is to store the DataFrame storesDF only in Spark's memory. In Spark versions before 3.0, calling cache() would default to memory-only storage (MEMORY_ONLY). Starting with Spark 3.0, however, the default for cache() changed to MEMORY_AND_DISK: if the DataFrame does not fit in memory, Spark spills the excess partitions to disk. That behavior no longer meets the "memory-only" requirement.

Because of this change, the correct answer is E: storesDF.persist(StorageLevel.MEMORY_ONLY).count(). This explicitly tells Spark to store the data only in memory and never spill to disk; partitions that do not fit are recomputed when needed rather than written to disk. Option C (storesDF.cache().count()) is not correct in Spark 3.0+ because it allows disk usage, and the other options either have invalid syntax or do not specify memory-only storage.
Author: LeetQuiz Editorial Team
The following code block should cache the DataFrame storesDF exclusively in Spark's memory. Select the option that accurately fills in the numbered blanks within the code block to accomplish this task.
Code block:
__1__.__2__(__3__).count()
A
B
C. storesDF.cache().count()
D
E. storesDF.persist(StorageLevel.MEMORY_ONLY).count()
