
In the context of developing a machine learning model, your team relies on a third-party data broker for training data. However, the broker's notifications about data formatting changes are unreliable, so schema changes can reach your training pipeline unannounced and compromise its robustness. Given the need for a solution that ensures data integrity without significantly increasing operational costs, which of the following approaches would you implement? (Choose one correct option)
A
Implement custom TensorFlow functions at the pipeline's start to manually detect and flag known formatting errors, requiring continuous updates as new errors are identified.
B
Integrate TensorFlow Transform to preprocess data, normalizing it to an expected distribution and replacing non-conforming values with zeros, which may mask underlying data issues.
C
Utilize tf.math to compute summary statistics on incoming data and identify statistical anomalies, though this may not directly address schema inconsistencies.
D
Adopt TensorFlow Data Validation to automatically detect and flag schema abnormalities, providing a scalable solution for ensuring data integrity against formatting changes.
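
For context on option D: TensorFlow Data Validation (TFDV) can infer a schema from a known-good dataset and then validate each new delivery against that schema before training. Below is a minimal sketch of that workflow, assuming the data arrives as CSV; the file paths (broker_data/baseline.csv, broker_data/latest.csv) are illustrative, not part of the question.

```python
# A minimal sketch of the TFDV workflow described in option D.
import tensorflow_data_validation as tfdv

# 1. Profile a known-good delivery from the broker and infer a schema.
baseline_stats = tfdv.generate_statistics_from_csv(
    data_location='broker_data/baseline.csv')  # hypothetical path
schema = tfdv.infer_schema(statistics=baseline_stats)

# 2. Before each training run, validate the latest delivery against
#    that schema so unannounced formatting changes fail fast.
latest_stats = tfdv.generate_statistics_from_csv(
    data_location='broker_data/latest.csv')    # hypothetical path
anomalies = tfdv.validate_statistics(statistics=latest_stats,
                                     schema=schema)

if anomalies.anomaly_info:
    # anomaly_info maps feature names to descriptions of what changed
    # (missing columns, type changes, out-of-domain values, ...).
    for feature, info in anomalies.anomaly_info.items():
        print(f'{feature}: {info.description}')
    raise ValueError('Broker data does not match the expected schema.')
```

Because the schema lives as an artifact rather than hand-written checks, this approach scales with new datasets and avoids the continuous maintenance burden that option A implies.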