In the context of optimizing Spark performance for a large-scale machine learning project, which technique is used to store intermediate data in memory, thereby speeding up iterative algorithms and reducing disk I/O?
Explanation:
In Spark, in-memory computation is a key performance feature. The technique is caching (or persisting) intermediate data: calling `cache()` or `persist()` on an RDD or DataFrame keeps its computed partitions in memory (the storage level can be tuned via `StorageLevel`). This is especially beneficial for iterative machine learning algorithms, which make many passes over the same dataset: once the data is cached, Spark avoids recomputing it or re-reading it from disk on each iteration, reducing disk I/O and speeding up execution.