
Answer-first summary for fast verification
Answer: Use Cloud Data Fusion and Wrangler to normalize the data, and set up a recurring job.
**Correct Answer: C**

- **Cloud Data Fusion** is a fully managed, cloud-native data integration service for building and managing ETL/ELT pipelines. Its visual interface lets you build pipelines without writing code.
- **Wrangler**, a feature within Cloud Data Fusion, provides a no-code environment for visually transforming, cleaning, and preparing data.
- Together, they let you standardize telephone formats and country code identifiers without writing code, and the resulting pipeline can be scheduled to run at regular intervals.

**Why the other options are incorrect:**

- **A:** Dataproc Serverless removes cluster management, but you still have to write and maintain the Spark job yourself, so it is not a no-code solution.
- **B:** GoogleSQL with scheduled queries can normalize the data on a recurring basis, but it requires writing and maintaining SQL, which fails the no-code requirement.
- **D:** Dataflow SQL likewise requires writing SQL, and scheduling recurring runs takes additional setup, so it is neither no-code nor the quickest option.
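To make "normalizing telephone formats and country code identifiers" concrete, here is a minimal Python sketch of the kind of transformation described above. It is purely illustrative and is not how Wrangler is configured (Wrangler applies such steps visually, without code). The function names, the alias table, and the default calling code of `1` are all assumptions made for this example, and the phone logic assumes that numbers without a `+` or `00` prefix are national numbers.

```python
import re

# Hypothetical alias table for illustration only; a real pipeline would
# rely on a complete ISO 3166-1 mapping.
COUNTRY_ALIASES = {
    "us": "US", "usa": "US", "united states": "US",
    "uk": "GB", "gb": "GB", "united kingdom": "GB",
    "de": "DE", "deu": "DE", "germany": "DE",
}


def normalize_country(value: str) -> str:
    """Map a free-form country identifier to a two-letter code."""
    key = value.strip().lower().replace(".", "")
    # Unknown values are only trimmed and upper-cased, not guessed.
    return COUNTRY_ALIASES.get(key, value.strip().upper())


def normalize_phone(raw: str, default_calling_code: str = "1") -> str:
    """Reduce a phone number to '+<digits>' form (simplified)."""
    digits = re.sub(r"\D", "", raw)  # drop spaces, dashes, parentheses
    if raw.strip().startswith("+"):
        return "+" + digits          # already carries a calling code
    if digits.startswith("00"):
        return "+" + digits[2:]      # international '00' prefix
    # Assumption: anything else is a national number.
    return "+" + default_calling_code + digits


if __name__ == "__main__":
    for phone in ["(415) 555-0132", "+44 20 7946 0958", "0049 30 123456"]:
        print(normalize_phone(phone))
    for country in ["U.S.A.", " united kingdom ", "fr"]:
        print(normalize_country(country))
```

Writing and maintaining logic like this by hand is exactly the effort the no-code requirement in the question is meant to avoid.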
Author: LeetQuiz Editorial Team
Your company uses BigQuery to generate weekly executive reports, but you've noticed inconsistencies in data fields, such as varying telephone formats and country code identifiers. What's the quickest, no-code solution to normalize this data on a recurring basis?
A
Create a Spark job and submit it to Dataproc Serverless.
B
Use BigQuery and GoogleSQL to normalize the data, and schedule recurring queries in BigQuery.
C
Use Cloud Data Fusion and Wrangler to normalize the data, and set up a recurring job.
D
Use Dataflow SQL to create a job that normalizes the data, and after the first run, schedule the pipeline to execute recurrently.