
You are tasked with designing a cost-efficient data pipeline on Google Cloud that ingests JSON messages from Cloud Pub/Sub, processes them, and subsequently loads the transformed data into BigQuery. The solution must accommodate fluctuating input data volumes with minimal manual intervention. Which approach should you take to achieve this?
A. Use Cloud Dataproc to run your transformations. Monitor CPU utilization for the cluster. Resize the number of worker nodes in your cluster via the command line.
B. Use Cloud Dataproc to run your transformations. Use the diagnose command to generate an operational output archive. Locate the bottleneck and adjust cluster resources.
C. Use Cloud Dataflow to run your transformations. Monitor the job system lag with Stackdriver. Use the default autoscaling setting for worker instances.
D. Use Cloud Dataflow to run your transformations. Monitor the total execution time for a sampling of jobs. Configure the job to use non-default Compute Engine machine types when needed.
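For context, the Dataflow options describe a streaming pipeline whose core transform parses each Pub/Sub JSON payload into a row for BigQuery; worker count is then handled by the service rather than by hand. A minimal sketch of such a per-message transform, using only the standard library (the field names and schema are hypothetical, not part of the question):

```python
import json


def to_bigquery_row(message_bytes: bytes) -> dict:
    """Parse one Pub/Sub JSON payload into a flat dict suitable for a
    BigQuery streaming insert. Field names here are illustrative only."""
    record = json.loads(message_bytes.decode("utf-8"))
    return {
        "user_id": record.get("userId"),
        "event": record.get("event"),
        "value": float(record.get("value", 0)),
    }


# In an Apache Beam / Dataflow pipeline this function would typically run
# inside a Map or ParDo step between the Pub/Sub source and the BigQuery
# sink, e.g.:  messages | beam.Map(to_bigquery_row) | WriteToBigQuery(...)
row = to_bigquery_row(b'{"userId": "u1", "event": "click", "value": "3"}')
print(row)
```

With Dataflow's default autoscaling, the service adds or removes workers as the Pub/Sub backlog and system lag change, which is what makes the pipeline handle fluctuating volumes without manual resizing.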