
Answer-first summary for fast verification
Answer: Data ingestion, the process of bringing raw data into the system from various sources.
**Correct Option:** C. Data ingestion

This is correct because data ingestion is the phase where data is collected and acquired from various sources, addressing the challenges of data volume, velocity, and variety. It involves bringing raw data into the system, whether through batch processing, real-time streaming, or other methods. This step is crucial for gathering the necessary data that will be used in subsequent processing and analysis stages.

**Incorrect Options:**

A. Data transformation: This is incorrect because data transformation is the process of converting data into a suitable format for analysis, which happens after the data has been ingested and pre-processed. It involves tasks such as normalization, scaling, and encoding.

B. Data modeling: This is incorrect because data modeling involves creating a mathematical or computational model based on the prepared data. This phase occurs after data ingestion, pre-processing, and transformation, and it focuses on building and training machine learning models.

D. Data pre-processing: This is incorrect because data pre-processing involves cleaning, transforming, and preparing the data for analysis after it has been ingested. It does not include the initial collection and acquisition of data.
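The ordering of the four phases described above can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration: the sample data, the function names, and the trivial mean "model" are hypothetical and use only the Python standard library. The point is only that ingestion brings raw records in unchanged (in batch or as a stream), and the other phases operate on what ingestion delivers.

```python
import csv
import io
from typing import Iterable, Iterator

# Hypothetical raw sources: a CSV export (batch) and an event feed (stream).
BATCH_CSV = "sensor,reading\na,10\nb,\nc,30\n"
STREAM_EVENTS = [{"sensor": "d", "reading": "20"}]


def ingest_batch(csv_text: str) -> list[dict]:
    """Ingestion (batch): load every raw record at once, unmodified."""
    return list(csv.DictReader(io.StringIO(csv_text)))


def ingest_stream(events: Iterable[dict]) -> Iterator[dict]:
    """Ingestion (streaming): yield raw records one at a time as they arrive."""
    for event in events:
        yield event


def preprocess(records: Iterable[dict]) -> list[float]:
    """Pre-processing: drop records with a missing reading, parse the rest."""
    return [float(r["reading"]) for r in records if r["reading"]]


def transform(values: list[float]) -> list[float]:
    """Transformation: min-max scale values into [0, 1]."""
    lo, hi = min(values), max(values)
    span = (hi - lo) or 1.0  # avoid division by zero when all values are equal
    return [(v - lo) / span for v in values]


def fit_model(values: list[float]) -> float:
    """Modeling: a trivial baseline 'model' that predicts the mean."""
    return sum(values) / len(values)


# Phases run in order: ingestion -> pre-processing -> transformation -> modeling.
raw = ingest_batch(BATCH_CSV) + list(ingest_stream(STREAM_EVENTS))
clean = preprocess(raw)
scaled = transform(clean)
baseline = fit_model(scaled)
```

Note that the record with the missing reading is still present in `raw`; it is only removed in the pre-processing step, which is why collection and acquisition belong to ingestion rather than to any later phase.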
Author: LeetQuiz Editorial Team
In the context of designing a scalable and efficient data system for a machine learning project, during which phase is data typically collected and acquired from various sources, considering factors such as data volume, velocity, and variety? Choose the best option.
A. Data transformation, where data is converted into a suitable format for analysis.
B. Data modeling, which involves creating mathematical models based on the prepared data.
C. Data ingestion, the process of bringing raw data into the system from various sources.
D. Data pre-processing, which includes cleaning and preparing the data for analysis.