
Answer-first summary for fast verification
Answer: Use a custom connector with Apache Beam to write a Dataflow pipeline that streams the data into BigQuery in Avro format.
The best option for efficiently importing the proprietary flight data into BigQuery with minimal resource consumption is **Option D** – use a custom connector with Apache Beam to write a Dataflow pipeline that streams the data into BigQuery in Avro format. This approach is recommended because:

- **Apache Beam** provides a flexible framework for building custom I/O connectors, so the proprietary format can be parsed directly inside the pipeline with no intermediate staging.
- **Avro** is a compact, self-describing binary format that BigQuery ingests natively, which keeps serialization and network overhead low during streaming.

Other options considered:

- **Option A** uses Apache Hive on Dataproc, which requires provisioning and managing a cluster – unnecessary overhead for a streaming import into BigQuery.
- **Option B** relies on a shell script triggering Cloud Functions for periodic ETL batch jobs, which adds latency and operational complexity compared to a managed streaming pipeline.
- **Option C** stores the raw proprietary data in BigQuery first and transforms it at query time, which wastes storage and repeats the transformation cost on every read.
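The core of such a custom connector is the parsing step that turns each proprietary record into a row matching the Avro/BigQuery schema. Below is a minimal sketch of that step, assuming a *hypothetical* fixed-width binary layout (the field names, widths, and schema are illustrative, not part of the question):

```python
import struct

# Hypothetical fixed-width layout for one proprietary flight record:
# 8-byte tail number (ASCII, space-padded), float64 altitude_ft, float64 speed_kts.
RECORD_FMT = ">8sdd"
RECORD_SIZE = struct.calcsize(RECORD_FMT)  # 24 bytes per record

def parse_flight_record(raw: bytes) -> dict:
    """Decode one binary record into a dict matching the target schema."""
    tail, altitude, speed = struct.unpack(RECORD_FMT, raw)
    return {
        "tail_number": tail.decode("ascii").strip(),
        "altitude_ft": altitude,
        "speed_kts": speed,
    }

# In the actual Dataflow pipeline, this function would run inside the Beam
# graph, e.g. (sketch, not executed here):
#   rows = records | beam.Map(parse_flight_record)
#   rows | beam.io.WriteToBigQuery(table, schema=..., method="STREAMING_INSERTS")
```

The `struct`-based decoder stands in for whatever the real proprietary format requires; the point is that the custom logic lives in one reusable function that Beam can apply to every element of the stream.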
Author: LeetQuiz Editorial Team
A company in the aerospace industry has flight data stored in a proprietary format. You are tasked with importing this data into BigQuery efficiently, with minimal resource consumption. Which of the following options would you choose?
**A.** Write a Dataproc job using Apache Hive to stream the data into BigQuery in CSV format.

**B.** Use a shell script that triggers a Cloud Function for periodic ETL batch jobs on the new data source.

**C.** Use a standard Dataflow pipeline to store the raw data in BigQuery, then transform the format later when the data is used.

**D.** Use a custom connector with Apache Beam to write a Dataflow pipeline that streams the data into BigQuery in Avro format.