
Answer-first summary for fast verification
Answer: It introduces information from the validation or test set into the training set, leading to overly optimistic performance estimates.
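Data leakage occurs when information that would not be available at prediction time, such as rows or statistics from the validation or test set, influences how the model is trained. The model is then evaluated on data it has effectively already seen, so the measured performance overstates how well it will do on genuinely unseen data. Overfitting (option C) can be a side effect, but the defining problem is the contaminated evaluation.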
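One common way leakage creeps in is through preprocessing that is fitted on the full dataset before the split. The sketch below is a minimal illustration using only the Python standard library; the toy data and the helper names `fit_scaler` and `transform` are hypothetical and not taken from any particular library.

```python
from statistics import mean, pstdev

def fit_scaler(values):
    """Return (mean, std) computed from the given values only."""
    return mean(values), pstdev(values)

def transform(values, scaler):
    """Standardize values with previously fitted statistics."""
    mu, sigma = scaler
    return [(v - mu) / sigma for v in values]

# Hypothetical toy feature; the last two rows are held out as the test set.
data = [1.0, 2.0, 3.0, 4.0, 100.0, 120.0]
train, test = data[:4], data[4:]

# Leaky: the scaler statistics are computed on train + test together,
# so the training features already encode information about the test rows.
leaky_scaler = fit_scaler(data)

# Clean: the statistics come from the training rows only and are then
# reused, unchanged, to transform the test rows.
clean_scaler = fit_scaler(train)

leaky_train = transform(train, leaky_scaler)
clean_train = transform(train, clean_scaler)
clean_test = transform(test, clean_scaler)

print("leaky scaler (mean, std):", leaky_scaler)
print("clean scaler (mean, std):", clean_scaler)
```

The two scalers differ only because the leaky one has seen the held-out rows. The same rule applies to any fitted step (imputation, feature selection, encoding): split first, fit on the training portion, then apply to validation and test data.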
Author: LeetQuiz Editorial Team
What is the main issue with data leakage in machine learning?
A. It can cause underfitting of the model.
B. It increases the computational complexity of the algorithm.
C. It leads to overfitting of the model.
D. It introduces information from the validation or test set into the training set, leading to overly optimistic performance estimates.