
Google Professional Machine Learning Engineer
Get started today
Ultimate access to all questions.
You are working on a machine learning model that relies on data from a third-party data broker. However, the data broker has a history of making unannounced formatting changes to the data, which could disrupt your model training pipeline. To mitigate this risk and make your pipeline more robust, what should you do?
You are working on a machine learning model that relies on data from a third-party data broker. However, the data broker has a history of making unannounced formatting changes to the data, which could disrupt your model training pipeline. To mitigate this risk and make your pipeline more robust, what should you do?
Explanation:
The correct answer is A. TensorFlow Data Validation (TFDV) is specifically designed to analyze training and serving data to detect schema anomalies and other data-related issues. It allows you to compute descriptive statistics, infer the data schema, and automatically identify any changes in the schema or data types that could impact your model. This makes it a robust choice for ensuring your model training pipeline remains consistent, even when working with data from sources with unpredictable formatting.