
Answer-first summary for fast verification
Answer: Store the data in a multi-regional Cloud Storage bucket. Access the data directly using Dataproc, BigQuery, and Compute Engine.
The correct answer is D. A multi-regional Cloud Storage bucket keeps the data redundant across regions, which maximizes availability, an explicit condition in the question. Because performance is not a factor, there is no need to trade that availability for the lower latency of a regional bucket placed next to the compute resources. Cloud Storage also holds files of any format and can be read directly by Dataproc, BigQuery, and Compute Engine. Options A, B, and C each miss at least one condition. Option A keeps the data in HDFS on a Dataproc cluster: even with a high-availability configuration, the data lives only as long as the cluster does, and BigQuery cannot read it directly. Option B is poorly suited to unstructured files such as PDFs. Option C keeps the data in a single region, so it does not offer the same level of availability as multi-regional storage.
Author: LeetQuiz Editorial Team
You are tasked with creating a cloud-native historical data processing system with the following requirements in mind:
How would you structure the data storage to meet these needs?
A. Create a Dataproc cluster with high availability. Store the data in HDFS, and perform analysis as needed.
B. Store the data in BigQuery. Access the data using the BigQuery Connector on Dataproc and Compute Engine.
C. Store the data in a regional Cloud Storage bucket. Access the bucket directly using Dataproc, BigQuery, and Compute Engine.
D. Store the data in a multi-regional Cloud Storage bucket. Access the data directly using Dataproc, BigQuery, and Compute Engine.
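The elimination reasoning in the explanation can be sketched as a small check of each option against the conditions it mentions. This is only an illustration: the requirement names and the per-option property sets below are hypothetical and reflect a simplified reading of the explanation, not an official feature matrix for any Google Cloud product.

```python
# Conditions named in the explanation (hypothetical labels chosen for this sketch).
REQUIREMENTS = (
    "stores_any_file_format",   # structured files and unstructured files such as PDFs
    "readable_by_all_tools",    # Dataproc, BigQuery, and Compute Engine
    "multi_region",             # data is kept in more than one region
)

# Assumed, simplified properties of each option. A property missing from a set
# means "not credited in this sketch", not a definitive product limitation.
OPTIONS = {
    "A": {"stores_any_file_format"},                           # HDFS on a Dataproc cluster
    "B": {"readable_by_all_tools"},                            # BigQuery plus connector
    "C": {"stores_any_file_format", "readable_by_all_tools"},  # regional bucket
    "D": {"stores_any_file_format", "readable_by_all_tools", "multi_region"},
}


def unmet(option: str) -> list[str]:
    """Return the requirements that the given option does not satisfy."""
    return [req for req in REQUIREMENTS if req not in OPTIONS[option]]


def best_options() -> list[str]:
    """Return the options that satisfy every requirement."""
    return [label for label in OPTIONS if not unmet(label)]


if __name__ == "__main__":
    for label in OPTIONS:
        print(label, "unmet:", unmet(label))
    print("Meets all conditions:", best_options())
```

Under these assumptions only option D is left with no unmet condition, and option C fails solely on the multi-region requirement, which matches the explanation.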