
Answer-first summary for fast verification
Answer: Run the TFX pipeline in Vertex AI Pipelines. Set the appropriate Apache Beam parameters in the pipeline to run the data preprocessing steps in Dataflow.
Option B is correct. Vertex AI Pipelines is the orchestrator with native integration with Vertex AI Experiments (for publishing metrics and parameters) and Vertex ML Metadata (for tracking artifacts). Setting the Apache Beam parameters so that the preprocessing steps run in Dataflow lets the pipeline scale efficiently to 100 TB of data from BigQuery, leveraging Dataflow's distributed processing. Option A is less suitable because Vertex AI Training jobs are optimized for model training, not data preprocessing. Options C and D are suboptimal because they use Dataproc or Dataflow directly as the orchestrator; neither integrates natively with Vertex AI Experiments or Vertex ML Metadata, so metric publishing and artifact tracking would require extra configuration and custom work.
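As a sketch of option B's configuration: the snippet below shows the Apache Beam parameters that route a TFX pipeline's Beam-based preprocessing onto Dataflow, with the Vertex AI Pipelines compilation step outlined in comments (TFX 1.x API). The project, region, and bucket names are placeholder assumptions, not values from the question.

```python
# Hedged sketch: my-gcp-project, us-central1, and gs://my-bucket/... are
# placeholders, not values from the question.

# The key setting: Beam pipeline args that select the DataflowRunner, so the
# Beam-based standard components (ExampleGen, Transform, etc.) execute their
# data processing on Dataflow instead of on the pipeline worker.
beam_pipeline_args = [
    "--runner=DataflowRunner",
    "--project=my-gcp-project",            # placeholder GCP project
    "--region=us-central1",                # placeholder region
    "--temp_location=gs://my-bucket/tmp",  # placeholder staging location
]

# These args are passed once on the pipeline definition and the pipeline is
# then compiled for Vertex AI Pipelines, roughly like this:
#
#   from tfx import v1 as tfx
#
#   pipeline = tfx.dsl.Pipeline(
#       pipeline_name="preprocessing-pipeline",
#       pipeline_root="gs://my-bucket/pipeline-root",
#       components=components,  # the standard TFX components
#       beam_pipeline_args=beam_pipeline_args,
#   )
#
#   runner = tfx.orchestration.experimental.KubeflowV2DagRunner(
#       config=tfx.orchestration.experimental.KubeflowV2DagRunnerConfig(),
#       output_filename="pipeline.json",
#   )
#   runner.run(pipeline)  # pipeline.json is then submitted to Vertex AI

print(beam_pipeline_args[0])  # → --runner=DataflowRunner
```

Because the runner selection lives in `beam_pipeline_args` rather than in any single component, every Beam-powered step scales on Dataflow without changing the pipeline's structure, while Vertex AI Pipelines remains the orchestrator that records experiments and metadata.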
Author: LeetQuiz Editorial Team
You are developing a TensorFlow Extended (TFX) pipeline with standard components that includes data preprocessing. The pipeline will be deployed to production and must process up to 100 TB of data from BigQuery. To ensure the data preprocessing steps scale efficiently, publish metrics and parameters to Vertex AI Experiments, and track artifacts using Vertex ML Metadata, how should you configure the pipeline run?
A
Run the TFX pipeline in Vertex AI Pipelines. Configure the pipeline to use Vertex AI Training jobs with distributed processing.
B
Run the TFX pipeline in Vertex AI Pipelines. Set the appropriate Apache Beam parameters in the pipeline to run the data preprocessing steps in Dataflow.
C
Run the TFX pipeline in Dataproc by using the Apache Beam TFX orchestrator. Set the appropriate Vertex AI permissions in the job to publish metadata in Vertex AI.
D
Run the TFX pipeline in Dataflow by using the Apache Beam TFX orchestrator. Set the appropriate Vertex AI permissions in the job to publish metadata in Vertex AI.