
Ultimate access to all questions.
In your role as a Machine Learning Engineer, you are responsible for a model that relies on data from a third-party broker. This broker occasionally changes the data format without prior notice, posing a risk to the model's performance. Your team is concerned about the potential impact on model accuracy and the additional maintenance overhead. Considering the need for a solution that is both cost-effective and scalable, and that minimizes manual intervention, which of the following approaches would you recommend to enhance the robustness of your model training pipeline against such issues? (Choose one correct option)
A
Implement TensorFlow Transform to automatically normalize data to the expected distribution, substituting non-conforming values with 0, ensuring the model receives data in a consistent format.
B
Develop custom TensorFlow functions to be executed at the beginning of each model training session, designed to identify and report known formatting discrepancies, requiring manual review for each anomaly detected.
C
Use tf.math to perform a detailed analysis of the data, calculating summary statistics and highlighting any statistical irregularities, which then must be manually addressed before model training can proceed.
D
Employ TensorFlow Data Validation (TFDV) to automatically uncover and report schema inconsistencies, enabling the pipeline to adjust or flag issues without manual intervention.