
Explanation:
Option A is correct: to deploy a Spark ML model, call the trained model's save method (shorthand for model.write().save(path), which goes through MLWriter) to persist it to a directory on a distributed file system such as HDFS or S3. The persisted model can then be loaded in a separate job and reused for inference. Option B is incorrect because MLReader and the load method restore a model that has already been saved; they do not save it. Option C is incorrect because fit trains the model and transform generates predictions, and neither persists anything. Option D is incorrect because a model that is never saved exists only in the current Spark session, so any other job or later session would have to retrain it before making predictions, which is time-consuming and inefficient.
In the context of model deployment using Spark ML, explain the process of saving and loading a trained machine learning model for inference. Provide a code snippet demonstrating the use of Spark ML's MLWriter for model persistence and explain the key considerations to keep in mind during this process.
A
Use the save method of the trained model to save the model in a specified directory in a distributed file system like HDFS or S3.
B
Use the load method of the MLReader class from the pyspark.ml module to load the saved model from the specified directory for inference.
C
Use the fit method of the Spark ML estimator to train the model and then use the transform method to make predictions on new data.
D
Use the predict method of the trained model to make predictions on new data without saving and loading the model.