When running a large Spark job that occasionally fails with OutOfMemoryError exceptions, which technique most effectively manages memory usage?
A
Using persist(StorageLevel.DISK_ONLY) for intermediate DataFrames.
B
Decreasing spark.executor.memory to force garbage collection to run more frequently.
C
Increasing the number of partitions to distribute data more evenly across executors.
D
Allocating more memory to the driver than executors to manage task distribution.
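For reference, here is a minimal PySpark sketch of the repartitioning approach described in option C. The data paths, partition count, and column names are hypothetical and chosen only for illustration; the right partition count depends on data volume and executor sizing.

```python
from pyspark.sql import SparkSession

# Hypothetical session; in a cluster job this is usually provided for you.
spark = SparkSession.builder.appName("repartition-example").getOrCreate()

# Hypothetical input path.
df = spark.read.parquet("s3://example-bucket/events/")

# Increase the number of partitions so each task processes a smaller
# slice of the data, lowering per-executor memory pressure.
repartitioned = df.repartition(400)

# Raising shuffle parallelism has a similar effect on wide operations
# such as joins and aggregations.
spark.conf.set("spark.sql.shuffle.partitions", "400")

result = repartitioned.groupBy("user_id").count()  # hypothetical column
result.write.mode("overwrite").parquet("s3://example-bucket/output/")
```

Note that `repartition` triggers a full shuffle, so it is typically applied once before memory-intensive stages rather than repeatedly throughout a job.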