
Answer-first summary for fast verification
Answer: Use Cloud Dataprep and configure the BigQuery tables as the source. Schedule a daily job to clean the data.
The correct answer is D. Cloud Dataprep is purpose-built for data preparation: it provides a graphical user interface (GUI) for profiling, cleaning, and transforming data without writing code, and its jobs can be scheduled to run on a recurring basis, which makes it a natural fit for automating daily data-quality work. The other options could accomplish the task but are more complex or costlier: a streaming Dataflow pipeline (A) requires custom development and changes to the ingestion path, a Cloud Function triggered from a Compute Engine instance (B) adds unnecessary moving parts and is poorly suited to large-scale processing, and a hand-written SQL view (C) must be authored and maintained for every quality rule. Scheduling a daily Cloud Dataprep job with the BigQuery tables as its source is therefore an efficient and cost-effective way to clean the imported data.
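To make the "cleaning" step concrete, the sketch below shows, in plain Python, the kinds of row-level transformations a Dataprep recipe typically applies to dirty telemetry: trimming stray whitespace, dropping rows with missing required fields, and de-duplicating. The field names (`vehicle_id`, `timestamp`) are hypothetical, chosen only to fit the TerramEarth scenario; this is an illustration of the transformations, not Dataprep's actual API.

```python
def clean_rows(rows):
    """Normalize and de-duplicate telemetry rows (illustrative sketch)."""
    seen = set()
    cleaned = []
    for row in rows:
        # Trim stray whitespace from every string field.
        row = {k: v.strip() if isinstance(v, str) else v
               for k, v in row.items()}
        # Drop rows missing the (hypothetical) required keys.
        if not row.get("vehicle_id") or row.get("timestamp") is None:
            continue
        # De-duplicate on the (vehicle_id, timestamp) pair.
        key = (row["vehicle_id"], row["timestamp"])
        if key in seen:
            continue
        seen.add(key)
        cleaned.append(row)
    return cleaned

dirty = [
    {"vehicle_id": " T-100 ", "timestamp": 1, "fuel": 0.8},
    {"vehicle_id": "T-100", "timestamp": 1, "fuel": 0.8},  # duplicate
    {"vehicle_id": "", "timestamp": 2, "fuel": 0.5},       # missing id
]
print(clean_rows(dirty))  # only the first row survives, trimmed
```

In Dataprep these same steps are expressed as recipe steps in the GUI rather than code, and the scheduled job handles the daily execution automatically.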
Author: LeetQuiz Editorial Team
TerramEarth, a company specializing in heavy equipment for mining and agriculture, has 20 million vehicles in operation, collecting 120 fields of data per second. The data is stored locally and is accessed during vehicle maintenance. Approximately 200,000 vehicles transmit data via a cellular network, contributing to about 9 TB/day. Their existing systems are based in a single US west coast data center, processing data with significant delay, leading to downtime for customers. The company has introduced a new architecture that writes all incoming data to BigQuery but has noticed that the data is dirty. How can you ensure data quality on an automated daily basis while managing costs?
A
Set up a streaming Cloud Dataflow job, receiving data by the ingestion process. Clean the data in a Cloud Dataflow pipeline.
B
Create a Cloud Function that reads data from BigQuery and cleans it. Trigger the Cloud Function from a Compute Engine instance.
C
Create a SQL statement on the data in BigQuery, and save it as a view. Run the view daily, and save the result to a new table.
D
Use Cloud Dataprep and configure the BigQuery tables as the source. Schedule a daily job to clean the data.