
Answer-first summary for fast verification
Answer: Employ Cloud Dataflow to convert text files into compressed Avro format, store them in Cloud Storage, and use BigQuery permanent linked tables for querying.
The optimal strategy for handling very large text files in a Google Cloud data pipeline, ensuring ANSI SQL query support, compression, and parallel loading, is Option B: use Cloud Dataflow to convert the text files into compressed Avro, store the Avro files in Cloud Storage, and query them through BigQuery permanent external (linked) tables. Avro is a splittable, self-describing binary format, so compressed Avro files can still be read and loaded in parallel. Option A is less ideal because loading everything into BigQuery native storage increases storage costs and gives up the flexibility of keeping the source files in Cloud Storage. Option C fails the parallel-loading requirement: a gzip file is not splittable, so each file must be decompressed by a single worker. Option D is unsuitable because Cloud Bigtable is a NoSQL database and does not support ANSI SQL queries.
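As an illustration of the recommended pattern, once Dataflow has written the compressed Avro files to Cloud Storage, a permanent external table can be defined over them with BigQuery DDL. The dataset, table, and bucket names below are placeholders, not values from the question:

```sql
-- Permanent external table over compressed Avro files in Cloud Storage.
-- `mydataset`, `events_ext`, and the gs:// URI are illustrative placeholders.
CREATE EXTERNAL TABLE mydataset.events_ext
OPTIONS (
  format = 'AVRO',
  uris = ['gs://my-bucket/avro-output/*.avro']
);

-- The table can then be queried with standard ANSI SQL:
SELECT COUNT(*) FROM mydataset.events_ext;
```

Because the data stays in Cloud Storage, the files remain available to other tools while BigQuery reads them on demand.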
Author: LeetQuiz Editorial Team
When designing storage for very large text files in a Google Cloud data pipeline that requires ANSI SQL query support, compression, and parallel loading, which approach aligns with Google's best practices?
A
Utilize Cloud Dataflow to transform text files into compressed Avro format and store them in BigQuery for querying.
B
Employ Cloud Dataflow to convert text files into compressed Avro format, store them in Cloud Storage, and use BigQuery permanent linked tables for querying.
C
Compress text files to gzip format using Grid Computing Tools and store them in BigQuery for querying.
D
Compress text files to gzip format using Grid Computing Tools, store them in Cloud Storage, and then import into Cloud Bigtable for querying.