
Answer-first summary for fast verification
Answer: BigQuery (A), Dataflow (C), TensorFlow (D)
Google recommends BigQuery because it handles both data engineering and feature engineering in standard SQL: correcting, splitting, and aggregating data, then processing features by merging, normalizing, and categorizing them. Dataflow is recommended for more advanced transformations, such as window-aggregation features computed in streaming mode, and it scales to large, unstructured datasets. TensorFlow is recommended for creating new features: feature columns such as crossed_column, embedding_column, and bucketized_column (part of the tf.feature_column API) derive features from raw inputs, while tf.Transform preprocessing is attached to the model's graph when the SavedModel is created, so the same transformations run at training and serving time. Cloud Composer is a workflow orchestration tool, and Cloud Dataproc and Cloud Storage have their own roles in data processing and storage, but none of the three is primarily recommended for automatic data transformation before model training. For more details, refer to Google's documentation on preparing data and managing datasets in Vertex AI.
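To make the transformations named above concrete, here is a minimal pure-Python sketch of four of them: normalization, bucketization, feature crossing, and a fixed (tumbling) window aggregation. It only illustrates the concepts and does not use the BigQuery, Dataflow, or TensorFlow APIs; every function name and the sample data are assumptions made for this illustration.

```python
from bisect import bisect_right
from statistics import mean, pstdev


def z_score(values):
    """Normalize values to zero mean and unit (population) standard deviation."""
    mu = mean(values)
    sigma = pstdev(values)
    if sigma == 0:
        # All values are identical, so there is no spread to scale by.
        return [0.0 for _ in values]
    return [(v - mu) / sigma for v in values]


def bucketize(value, boundaries):
    """Return the bucket index of value for sorted boundaries.

    A value equal to a boundary falls into the upper bucket.
    """
    return bisect_right(boundaries, value)


def cross(*parts):
    """Combine several categorical values into one crossed feature key."""
    return "_x_".join(str(p) for p in parts)


def tumbling_window_sums(events, window_seconds):
    """Sum amounts per fixed, non-overlapping time window.

    events is an iterable of (timestamp_seconds, amount) pairs.
    The result maps each window start time to the total amount in it.
    """
    sums = {}
    for timestamp, amount in events:
        window_start = (timestamp // window_seconds) * window_seconds
        sums[window_start] = sums.get(window_start, 0) + amount
    return sums


if __name__ == "__main__":
    # Hypothetical retail data: unit prices and timestamped sales.
    prices = [2.0, 4.0, 6.0]
    print(z_score(prices))
    print(bucketize(15, [10, 20]))
    print(cross("store_7", bucketize(15, [10, 20])))
    print(tumbling_window_sums([(0, 1), (30, 2), (60, 5)], 60))
```

In a real pipeline, the normalization and bucketing would more likely be expressed as SQL in BigQuery or as preprocessing functions in tf.Transform, and the window aggregation as a windowed pipeline in Dataflow; the sketch only shows what each step computes.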
Author: LeetQuiz Editorial Team
As a junior Data Scientist in a consulting company, you're involved in multiple ML projects across various industries. Your current project involves developing a predictive model for a retail client to forecast inventory demands. The client's data is vast, unstructured, and stored across multiple sources. Your task includes efficiently collecting, cleaning, and transforming this data into a structured format suitable for model training. Given the project's tight deadline and the need for scalability, you're evaluating Google-recommended services that offer automatic, scalable procedures for data transformation prior to training. Which three services does Google recommend for this purpose, considering the need for handling large-scale, unstructured data efficiently? (Choose three)
A. BigQuery
B. Cloud Composer
C. Dataflow
D. TensorFlow
E. Cloud Dataproc
F. Cloud Storage