Google Associate Cloud Engineer



Your company's Data Science team is developing a Dataflow job on Google Cloud to process large volumes of unstructured data in various file formats as part of an ETL process. What is the best approach to make this data accessible to the Dataflow job?




Explanation:

Uploading the data to Cloud Storage with the gcloud storage command is the best fit for large volumes of unstructured data in various file formats. Cloud Storage is an object store: it keeps files as they are, regardless of format, and offers high scalability, durability, and availability. Dataflow can read those objects directly from a bucket, which makes Cloud Storage a natural staging area for the ETL job. BigQuery, Cloud SQL, and Cloud Spanner are built around structured, schema-based data, so they are a poor fit for raw files in mixed formats.
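
As a rough illustration of the upload step, the sketch below assembles a `gcloud storage cp --recursive` command from Python and, by default, only prints it. The directory `./raw_data`, the bucket `example-bucket`, the prefix `landing`, and the helper names `build_upload_command` and `upload` are all hypothetical placeholders, not values from the question.

```python
import shlex
import subprocess


def build_upload_command(local_dir: str, bucket: str, prefix: str = "") -> list[str]:
    """Build the gcloud storage command that copies local_dir into a bucket.

    The bucket and prefix are placeholders chosen for this sketch.
    """
    # Normalize the destination so it always ends with exactly one slash.
    destination = f"gs://{bucket}/{prefix}".rstrip("/") + "/"
    return ["gcloud", "storage", "cp", "--recursive", local_dir, destination]


def upload(local_dir: str, bucket: str, prefix: str = "", dry_run: bool = True) -> list[str]:
    """Print the command (dry run) or execute it with the installed gcloud CLI."""
    command = build_upload_command(local_dir, bucket, prefix)
    if dry_run:
        # Show the command without touching any cloud resources.
        print(shlex.join(command))
    else:
        # Requires the gcloud CLI to be installed and authenticated.
        subprocess.run(command, check=True)
    return command


if __name__ == "__main__":
    # Hypothetical local directory, bucket, and prefix.
    upload("./raw_data", "example-bucket", "landing")
```

Once the files are in the bucket, the Dataflow job can reference them by their `gs://` path. Keeping `dry_run=True` as the default is a deliberate choice here so the sketch can be run without credentials.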