
In the context of preparing a dataset for a machine learning model aimed at predicting customer churn for a telecommunications company, data cleaning is a critical step. The dataset includes customer demographics, service usage, complaint history, and churn status. However, it contains missing values, duplicate records, and inconsistent entries in the complaint history. Considering the need for high accuracy in predictions to effectively reduce churn, which of the following best explains why data cleaning is indispensable? Choose the best option.
A. Data cleaning introduces variability in the dataset, making the model more robust to unseen data.
B. Data cleaning simplifies the dataset by removing unnecessary features, thus reducing computational costs.
C. Data cleaning ensures the dataset is free from errors and inconsistencies, which is crucial for training accurate models.
D. Data cleaning automates the feature selection process, eliminating the need for manual intervention.
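
For reference, the three data-quality problems named in the scenario (missing values, duplicate records, inconsistent complaint entries) can be illustrated with a small sketch. This is only one possible approach, not a prescribed method: the records, the field names (`customer_id`, `monthly_minutes`, `complaints`, `churned`), the word-to-number mapping, and the choice of median imputation are all assumptions made up for illustration.

```python
import statistics

# Hypothetical toy records; field names and values are illustrative assumptions.
raw_records = [
    {"customer_id": 1, "monthly_minutes": 320.0, "complaints": "2", "churned": 0},
    {"customer_id": 2, "monthly_minutes": None, "complaints": "none", "churned": 1},
    {"customer_id": 2, "monthly_minutes": None, "complaints": "none", "churned": 1},
    {"customer_id": 3, "monthly_minutes": 150.0, "complaints": " Three ", "churned": 0},
    {"customer_id": 4, "monthly_minutes": 410.0, "complaints": "N/A", "churned": 1},
]

# Assumed mapping for free-text complaint counts; "n/a" -> 0 is a judgment call.
WORD_TO_COUNT = {"none": 0, "n/a": 0, "one": 1, "two": 2, "three": 3}


def normalize_complaints(value):
    """Map inconsistent entries such as '2', ' Three ', or 'N/A' to an integer."""
    text = str(value).strip().lower()
    if text.isdigit():
        return int(text)
    return WORD_TO_COUNT.get(text)  # None if the entry is not recognized


def clean(records):
    """Return a cleaned copy of the records; the input list is not modified."""
    # 1. Drop exact duplicate records, keeping the first occurrence.
    seen, unique = set(), []
    for rec in records:
        key = tuple(sorted(rec.items()))
        if key not in seen:
            seen.add(key)
            unique.append(dict(rec))

    # 2. Fill missing usage with the median of the observed values (an assumption;
    #    other imputation strategies may suit a real dataset better).
    observed = [r["monthly_minutes"] for r in unique if r["monthly_minutes"] is not None]
    fill_value = statistics.median(observed)

    for rec in unique:
        if rec["monthly_minutes"] is None:
            rec["monthly_minutes"] = fill_value
        # 3. Bring inconsistent complaint entries into one numeric representation.
        rec["complaints"] = normalize_complaints(rec["complaints"])
    return unique


cleaned = clean(raw_records)
```

On a real dataset, each of these choices (what counts as a duplicate, how to impute, how to map free-text entries) would need to be checked against how the data was collected.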