
TerramEarth's 20 million vehicles, distributed globally, collect telemetry data that is stored in regional Google Cloud Storage (GCS) buckets based on each vehicle's location (US, Europe, or Asia). The company already stores and processes this data to provide insights, but the CTO now wants an analysis run to determine why vehicles break down after 100K miles. Given that the data is already divided into regional buckets, what is the most cost-effective way to run this analysis job on all the raw telemetry data?
A. Move all the data into 1 zone, then launch a Cloud Dataproc cluster to run the job.
B. Move all the data into 1 region, then launch a Cloud Dataproc cluster to run the job.
C. Launch a cluster in each region to preprocess and compress the raw data, then move the data into a multi-region bucket and use a Cloud Dataproc cluster to finish the job.
D. Launch a cluster in each region to preprocess and compress the raw data, then move the data into a regional bucket and use a Cloud Dataproc cluster to finish the job.
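
Options C and D share a per-region preprocessing step, and the cost lever the question probes is cross-region network egress: compressing the raw telemetry before moving it shrinks the transfer bill. As a point of reference, here is a minimal sketch, using the google-cloud-dataproc Python client, of how clusters could be stood up in each region; the project ID, region list, and cluster sizing are hypothetical, not part of the question.

```python
# Minimal sketch: create a small Dataproc cluster in each region to
# preprocess/compress that region's raw telemetry before consolidation.
# PROJECT_ID, REGIONS, and machine sizing are hypothetical placeholders.
from google.cloud import dataproc_v1

PROJECT_ID = "terramearth-analytics"  # hypothetical project ID
REGIONS = ["us-central1", "europe-west1", "asia-east1"]  # assumed bucket regions


def create_preprocess_cluster(region: str) -> None:
    # Dataproc uses a regional API endpoint, so build one client per region.
    client = dataproc_v1.ClusterControllerClient(
        client_options={"api_endpoint": f"{region}-dataproc.googleapis.com:443"}
    )
    cluster = {
        "project_id": PROJECT_ID,
        "cluster_name": f"telemetry-prep-{region}",
        "config": {
            "master_config": {"num_instances": 1, "machine_type_uri": "n1-standard-4"},
            "worker_config": {"num_instances": 2, "machine_type_uri": "n1-standard-4"},
        },
    }
    operation = client.create_cluster(
        request={"project_id": PROJECT_ID, "region": region, "cluster": cluster}
    )
    operation.result()  # block until the cluster is provisioned


for region in REGIONS:
    create_preprocess_cluster(region)
```

Each cluster would then run the compression job against its local bucket, and only the much smaller output would be copied to the consolidation bucket for the final Cloud Dataproc job.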