
Answer-first summary for fast verification
Answer: Use stratified sampling based on churn status and evaluate on precision, recall, and ROC-AUC.
Stratified sampling ensures that the training and testing subsets contain a representative sample of the churn status, which is crucial for evaluating the model's performance. Evaluating on precision, recall, and ROC-AUC provides a comprehensive assessment of the model's ability to predict churn.
Author: LeetQuiz Editorial Team
Ultimate access to all questions.
You are tasked with training a machine learning model using Spark ML to predict customer churn. The dataset includes various features such as customer demographics, usage patterns, and historical churn data. Describe how you would approach the data splitting, model training, and evaluation stages. Additionally, discuss any specific considerations for this type of predictive modeling task and how you would implement them in Spark ML.
A
Use a simple random split and evaluate only on accuracy.
B
Use stratified sampling based on churn status and evaluate on precision, recall, and ROC-AUC.
C
Ignore historical churn data and focus only on demographics.
D
Train the model on the entire dataset without splitting.
No comments yet.