
Answer-first summary for fast verification
Answer: D. Dataprep: a serverless service for exploring, cleaning, and preparing both structured and unstructured data for machine learning without writing code, offering a user-friendly interface and scalability.
Dataprep is the best fit for this scenario because it is built specifically for data preparation and offers a no-code, serverless workflow that is both cost-effective and scalable. It supports exploring, cleaning, and preparing data stored in various file formats, which is what the datasets need before they can be used for machine learning. The other options either require coding (Dataproc), require SQL knowledge (BigQuery), or are not designed for data preparation at all (Cloud Composer).
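To make the "requires coding" contrast concrete, here is a minimal sketch of the kind of hand-written cleaning that a no-code tool is meant to replace. It is not Dataprep's API or output; the sample data, column names, and the `clean_rows` helper are all hypothetical and exist only for illustration.

```python
import csv
import io

# Hypothetical raw data: stray whitespace, inconsistent casing,
# a missing age, and one duplicated row.
RAW_CSV = """name,age,city
 Alice ,30,paris
Bob,,London
 Alice ,30,paris
carol,41, BERLIN
"""


def clean_rows(text):
    """Trim whitespace, normalize casing, map empty ages to None, drop duplicates."""
    seen = set()
    cleaned = []
    for row in csv.DictReader(io.StringIO(text)):
        record = (
            row["name"].strip().title(),
            int(row["age"]) if row["age"].strip() else None,
            row["city"].strip().title(),
        )
        if record not in seen:  # keep only the first occurrence of each record
            seen.add(record)
            cleaned.append(record)
    return cleaned


rows = clean_rows(RAW_CSV)
for r in rows:
    print(r)
```

Even this small case needs explicit rules for every kind of defect, and each additional file format would need its own parsing code. That is the effort the "minimal coding" requirement in the question is meant to avoid.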
Author: LeetQuiz Editorial Team
You are a junior data scientist at a consulting company, and your first task is to prepare datasets for machine learning models. The datasets are stored in various file formats and require cleaning and correction. The solution must be cost-effective, scalable, and require minimal coding. Which Google Cloud Platform (GCP) service is most suitable for this task? Choose the best option.
A. Dataproc: A managed service for running Apache Spark and Apache Hadoop clusters, suitable for large-scale data processing but requires coding and is not specifically designed for data preparation.
B. BigQuery: A serverless, highly scalable data warehouse that requires SQL knowledge for data preprocessing, making it less straightforward for data cleaning tasks without additional tools.
C. Cloud Composer: A workflow orchestration service that manages workflows across clouds and on-premises data centers, not directly used for data preparation.
D. Dataprep: A serverless service designed for exploring, cleaning, and preparing both structured and unstructured data for machine learning without the need for coding, offering a user-friendly interface and scalability.