
Answer-first summary for fast verification
Answer: B. We can use the mean rather than median of the observations to replace missing observations.
## Explanation

Option B is false because, when replacing missing observations, the median is generally preferred over the mean, especially for skewed distributions or data with outliers. The mean is sensitive to extreme values, while the median is more robust and gives a better measure of central tendency for imputation.

**Why the other options are correct:**

- **A**: True - Outliers (observations several standard deviations from the mean) can significantly impact statistical results and should be carefully examined.
- **C**: True - Irrelevant observations should be removed to improve model performance and reduce noise.
- **D**: True - Consistent data formatting is essential for proper data reading and analysis.
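The difference between mean and median imputation is easiest to see on a small sample containing one extreme value. The sketch below is only an illustration: the `impute` helper and the `incomes` data are hypothetical and do not come from the question, and it uses Python's standard `statistics` module.

```python
from statistics import mean, median


def impute(values, strategy="median"):
    """Replace None entries with the mean or median of the observed values.

    Hypothetical helper for illustration; returns (filled_values, fill_value).
    """
    observed = [v for v in values if v is not None]
    fill = median(observed) if strategy == "median" else mean(observed)
    return [fill if v is None else v for v in values], fill


# Made-up sample: one missing value (None) and one extreme outlier (900).
incomes = [32, 35, 38, 40, 41, None, 45, 900]

_, mean_fill = impute(incomes, strategy="mean")
_, median_fill = impute(incomes, strategy="median")

print(f"mean fill:   {mean_fill:.1f}")  # pulled far upward by the outlier
print(f"median fill: {median_fill}")    # stays near the typical values
```

In this sample the single outlier drags the mean-based fill value well above every typical observation, while the median-based fill value stays in the middle of the bulk of the data. That is the robustness argument behind preferring the median.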
Author: LeetQuiz Editorial Team
In terms of the reasons for data cleaning, which of the following is false?
A
Observations on a feature that are several standard deviations from the mean should be checked carefully, as they can have a big effect on results.
B
We can use the mean rather than median of the observations to replace missing observations.
C
Observations not relevant to the task at hand should be removed.
D
For data to be read correctly, it is important that all data is recorded in the same way.