Google Professional Machine Learning Engineer


You are tasked with developing a Kubeflow pipeline on Google Kubernetes Engine (GKE) for a machine learning project. The initial step involves querying a large dataset stored in BigQuery, with the results serving as input for subsequent pipeline steps. The project has strict requirements for efficiency, scalability, and minimal manual intervention. Considering these constraints, which approach should you adopt to integrate BigQuery query execution into your Kubeflow pipeline? Choose the best option from the following:





Explanation:

The most efficient and straightforward way to run a BigQuery query inside a Kubeflow pipeline is to use the pre-built BigQuery Query Component (Option D). The component is designed for exactly this task: it needs minimal setup, integrates directly as a pipeline step, and abstracts away the API interactions and data handling, so no custom client code has to be written or maintained.

If the project needs further transformation or processing of the query results, combining the BigQuery Query Component with a custom component (Option E) is a balanced alternative: the pre-built component runs the query, and the custom component handles the post-processing. Either way, the approach meets the project's constraints by minimizing manual intervention while remaining scalable and performant.
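As a rough illustration of the pre-built component approach, the sketch below loads the BigQuery query component from its `component.yaml` and uses it as the first pipeline step. It assumes the KFP v1 SDK and the component layout from the `kubeflow/pipelines` repository (`components/gcp/bigquery/query/component.yaml`, with inputs such as `query`, `project_id`, `output_gcs_path`, and `dataset_location`). The URL placeholder, the pipeline name, and the helper `bigquery_step_args` are hypothetical and only serve the example; check the component definition for the release you actually pin.

```python
from typing import Dict

# Hypothetical placeholder: replace <release-or-commit> with the KFP release
# or commit you pin. The path layout is an assumption based on the KFP v1 repo.
BIGQUERY_COMPONENT_URL = (
    "https://raw.githubusercontent.com/kubeflow/pipelines/"
    "<release-or-commit>/components/gcp/bigquery/query/component.yaml"
)


def bigquery_step_args(query: str, project_id: str, output_gcs_path: str,
                       dataset_location: str = "US") -> Dict[str, str]:
    """Hypothetical helper: validate and collect the run arguments."""
    if not query.strip():
        raise ValueError("query must not be empty")
    if not output_gcs_path.startswith("gs://"):
        raise ValueError("output_gcs_path must be a gs:// URI")
    return {
        "query": query,
        "project_id": project_id,
        "output_gcs_path": output_gcs_path,
        "dataset_location": dataset_location,
    }


def build_pipeline(component_url: str = BIGQUERY_COMPONENT_URL):
    """Return a pipeline function whose first step runs the BigQuery query."""
    # Imported lazily so the helper above works without the KFP SDK installed.
    from kfp import components, dsl

    # Load the pre-built component instead of writing custom client code.
    bigquery_query_op = components.load_component_from_url(component_url)

    @dsl.pipeline(
        name="bigquery-query-pipeline",  # hypothetical name
        description="Runs a BigQuery query as the first pipeline step.",
    )
    def pipeline(query="", project_id="", output_gcs_path="",
                 dataset_location="US"):
        # Input names are assumed from the v1 component definition.
        bigquery_query_op(
            query=query,
            project_id=project_id,
            output_gcs_path=output_gcs_path,
            dataset_location=dataset_location,
        )
        # Subsequent steps would consume the query step's outputs here.

    return pipeline
```

With a real component URL and a reachable Kubeflow endpoint, the result of `build_pipeline()` could be submitted with `kfp.Client().create_run_from_pipeline_func(pipeline, arguments=bigquery_step_args(...))`. A custom post-processing component, as in the combined approach, would be added as a further step inside `pipeline`.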