
As a junior Data Scientist building a classification model with TensorFlow, you're working with a limited dataset. You're familiar with the standard practice of dividing data into training, validation, and test sets, but you're concerned the dataset may be too small to both train the model and evaluate it reliably. Given this constraint, which of the following approaches would best ensure model performance without overfitting? Choose the best option.
A
Split the dataset into training and test sets only, accepting the risk of overfitting given the limited size of the dataset.
B
Employ cross-validation to maximize the utility of the limited dataset, reducing the risk of overfitting and providing a more reliable estimate of model performance (see the sketch after the options).
C
Use the entire dataset for training, which invites overfitting because the model learns the noise in the training data and no held-out data remains to detect it.
D
Split the dataset into training, validation, and test sets, although with a small dataset each split may be too small to yield reliable validation and test estimates.
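
A minimal sketch of the cross-validation technique described in option B, wrapping a small Keras classifier in a k-fold loop. It assumes scikit-learn is available for the fold splitting; the toy data, the two-layer model, and every hyperparameter below are illustrative assumptions, not anything specified in the question.

```python
import numpy as np
import tensorflow as tf
from sklearn.model_selection import StratifiedKFold

# Toy stand-in for a small dataset; X, y and all hyperparameters
# here are illustrative assumptions only.
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 10)).astype("float32")
y = rng.integers(0, 2, size=200)

def build_model() -> tf.keras.Model:
    # Build a fresh model for each fold so weights never leak
    # between folds.
    model = tf.keras.Sequential([
        tf.keras.layers.Input(shape=(10,)),
        tf.keras.layers.Dense(16, activation="relu"),
        tf.keras.layers.Dense(1, activation="sigmoid"),
    ])
    model.compile(optimizer="adam",
                  loss="binary_crossentropy",
                  metrics=["accuracy"])
    return model

# Stratified 5-fold CV: every sample serves in both training and
# validation across the folds, which is what makes cross-validation
# attractive when data is scarce.
kfold = StratifiedKFold(n_splits=5, shuffle=True, random_state=42)
fold_accuracies = []
for train_idx, val_idx in kfold.split(X, y):
    model = build_model()
    model.fit(X[train_idx], y[train_idx],
              epochs=20, batch_size=16, verbose=0)
    _, accuracy = model.evaluate(X[val_idx], y[val_idx], verbose=0)
    fold_accuracies.append(accuracy)

print(f"CV accuracy: {np.mean(fold_accuracies):.3f} "
      f"± {np.std(fold_accuracies):.3f}")
```

Rebuilding the model inside the loop is deliberate: reusing a model already trained on earlier folds would leak information into the validation folds and inflate the scores, defeating the purpose of cross-validation.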