
Answer-first summary for fast verification
Answer: Use the `save` method of the trained model to save the model in a specified directory in a distributed file system like HDFS or S3.
Option A is correct: in Spark ML, a trained model is persisted by calling its `save` method with a target directory on storage that every node can reach, such as HDFS or S3. `model.save(path)` is a shortcut for `model.write().save(path)`, where `write()` returns the model's `MLWriter`. The saved model can then be loaded in another session and reused for inference. Option B is incorrect because `MLReader` loads a model that has already been saved; it does not save one. Option C is incorrect because `fit` trains the model and `transform` generates predictions; neither persists the model. Option D is incorrect because a model that is never saved exists only in the memory of the current application, so any other job or later session would have to retrain it, which is time-consuming and inefficient. Note also that Spark ML models normally score a DataFrame with `transform`.
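The save and load steps described above can be sketched as follows. This is a minimal illustration, not a definitive implementation: the helper names `save_model`, `load_model`, and `demo`, the toy columns `x1` and `x2`, and the path `/tmp/spark-lr-pipeline` are all assumptions made for the example. Only `write()`, `overwrite()`, `save()`, and `load()` are Spark ML calls.

```python
def save_model(model, path, overwrite=True):
    """Persist a fitted Spark ML model through its MLWriter."""
    writer = model.write()            # MLWritable.write() returns an MLWriter
    if overwrite:
        writer = writer.overwrite()   # replace an existing directory instead of failing
    writer.save(path)


def load_model(model_cls, path):
    """Load a saved model using the class that wrote it, e.g. PipelineModel."""
    return model_cls.load(path)       # shortcut for model_cls.read().load(path)


def demo(path="/tmp/spark-lr-pipeline"):
    """Hypothetical end-to-end run; requires pyspark and a local Spark runtime."""
    # Imports are local so the two helpers above stay importable without Spark.
    from pyspark.sql import SparkSession
    from pyspark.ml import Pipeline, PipelineModel
    from pyspark.ml.classification import LogisticRegression
    from pyspark.ml.feature import VectorAssembler

    spark = (SparkSession.builder.master("local[1]")
             .appName("persist-demo").getOrCreate())

    # Toy training data (assumed for illustration only).
    train = spark.createDataFrame(
        [(0.0, 1.0, 0.0), (1.0, 0.0, 1.0), (0.5, 0.5, 0.0), (0.9, 0.1, 1.0)],
        ["x1", "x2", "label"],
    )
    pipeline = Pipeline(stages=[
        VectorAssembler(inputCols=["x1", "x2"], outputCol="features"),
        LogisticRegression(maxIter=10),
    ])

    model = pipeline.fit(train)                  # train once
    save_model(model, path)                      # persist the fitted pipeline
    restored = load_model(PipelineModel, path)   # reload, e.g. in a serving job

    # Inference on a DataFrame goes through transform(), not retraining.
    restored.transform(train.drop("label")).select("x1", "x2", "prediction").show()
    spark.stop()
```

Calling `demo()` runs the whole cycle on a local Spark session. Key considerations: on a cluster the path should point to shared storage (for example an `hdfs://` or `s3a://` URI) rather than a node-local directory; `save` fails if the directory already exists unless `overwrite()` is used; the model must be loaded with the same class that saved it; and saving the whole fitted pipeline, rather than only the final estimator, keeps the feature-preparation stages together with the model. It is also worth checking compatibility when the loading job runs a different Spark version from the one that saved the model.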
Author: LeetQuiz Editorial Team
In the context of model deployment using Spark ML, explain the process of saving and loading a trained machine learning model for inference. Provide a code snippet demonstrating the use of Spark ML's `MLWriter` for model persistence, and explain the key considerations to keep in mind during this process.
A
Use the save method of the trained model to save the model in a specified directory in a distributed file system like HDFS or S3.
B
Use the load method of the MLReader class from the pyspark.ml module to load the saved model from the specified directory for inference.
C
Use the fit method of the Spark ML estimator to train the model and then use the transform method to make predictions on new data.
D
Use the predict method of the trained model to make predictions on new data without saving and loading the model.