In a Spark MLlib implementation, you are tasked with optimizing the performance of a linear regression model on a large dataset. Which of the following strategies would be most effective in improving the scalability and efficiency of the model?
A. Increase the number of iterations of the gradient descent algorithm to ensure convergence.
B. Use a distributed version of the stochastic gradient descent algorithm to parallelize the computation.
C. Reduce the number of features in the dataset to simplify the model and reduce the computational complexity.
D. Increase the batch size of the gradient descent algorithm to speed up convergence.
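For context on how MLlib handles large datasets, below is a minimal sketch of fitting a linear regression with Spark's DataFrame-based spark.ml API, where the solver computes per-partition statistics and gradients and aggregates them across the cluster, so the fit parallelizes with the data. The input path and column names (x1, x2, x3, label) are placeholders for illustration, not part of the question.

```scala
import org.apache.spark.ml.feature.VectorAssembler
import org.apache.spark.ml.regression.LinearRegression
import org.apache.spark.sql.SparkSession

object DistributedLinearRegressionSketch {
  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .appName("DistributedLinearRegression")
      .getOrCreate()

    // Placeholder input path and feature columns, for illustration only.
    val raw = spark.read.parquet("/data/training.parquet")

    // Combine the raw feature columns into the single vector column
    // that MLlib estimators expect.
    val assembler = new VectorAssembler()
      .setInputCols(Array("x1", "x2", "x3"))
      .setOutputCol("features")
    val training = assembler.transform(raw)

    // The optimizer (selectable via setSolver) aggregates loss and
    // gradient contributions computed in parallel on each partition,
    // so adding executors scales the computation with the data size.
    val lr = new LinearRegression()
      .setLabelCol("label")
      .setFeaturesCol("features")
      .setMaxIter(100)
      .setRegParam(0.1)

    val model = lr.fit(training)
    println(s"Coefficients: ${model.coefficients}, Intercept: ${model.intercept}")

    spark.stop()
  }
}
```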