
You are tasked with training a machine learning model using Spark ML to predict customer churn. The dataset includes various features such as customer demographics, usage patterns, and historical churn data. Describe how you would approach the data splitting, model training, and evaluation stages. Additionally, discuss any specific considerations for this type of predictive modeling task and how you would implement them in Spark ML.
A. Use a simple random split and evaluate only on accuracy.
B. Use stratified sampling based on churn status and evaluate on precision, recall, and ROC-AUC.
C. Ignore historical churn data and focus only on demographics.
D. Train the model on the entire dataset without splitting.
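
To make the difference between the options concrete, here is a minimal sketch in plain Python (not Spark) of a split stratified on churn status, followed by precision, recall, and ROC-AUC. The toy data, the 10% churn rate, and the helper names `stratified_split`, `precision_recall`, and `roc_auc` are all assumptions made for illustration. In Spark ML the same idea would more likely be expressed with `DataFrame.sampleBy` for an approximately stratified sample and `BinaryClassificationEvaluator` for area under the ROC curve.

```python
import random
from collections import defaultdict


def stratified_split(rows, label_key, train_fraction, seed):
    """Split rows so each label keeps the same share in train and test."""
    rng = random.Random(seed)
    by_label = defaultdict(list)
    for row in rows:
        by_label[row[label_key]].append(row)
    train, test = [], []
    for label in sorted(by_label):
        group = by_label[label][:]
        rng.shuffle(group)
        cut = round(len(group) * train_fraction)
        train.extend(group[:cut])
        test.extend(group[cut:])
    return train, test


def precision_recall(labels, predictions):
    """Precision and recall for the positive (churn = 1) class."""
    tp = sum(1 for y, p in zip(labels, predictions) if y == 1 and p == 1)
    fp = sum(1 for y, p in zip(labels, predictions) if y == 0 and p == 1)
    fn = sum(1 for y, p in zip(labels, predictions) if y == 1 and p == 0)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return precision, recall


def roc_auc(labels, scores):
    """Chance that a random churner outscores a random non-churner (ties count half)."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Hypothetical toy data: 100 customers, 10 of whom churned.
customers = [{"id": i, "churn": 1 if i < 10 else 0} for i in range(100)]
train, test = stratified_split(customers, "churn", 0.8, seed=42)

# A baseline that always predicts "no churn" looks accurate but finds no churners.
test_labels = [row["churn"] for row in test]
baseline_predictions = [0] * len(test_labels)
baseline_accuracy = sum(
    1 for y, p in zip(test_labels, baseline_predictions) if y == p
) / len(test_labels)
baseline_precision, baseline_recall = precision_recall(test_labels, baseline_predictions)
```

On this toy data the always-"no churn" baseline is right for every non-churner in the test set, so its accuracy is high while its recall on churners is zero. That gap is the reason accuracy alone (option A) is a weak metric for imbalanced churn data, and why the stratified split keeps the churn rate the same in both partitions.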