
A team building a machine learning model to predict customer churn for a telecommunications company is in the initial stages of setting up its data pipeline. The project must integrate data from multiple sources, including CRM systems, call detail records, and customer feedback forms, and the team is discussing the phases of data preparation and processing. Given the need for scalability, cost-efficiency, and compliance with data privacy regulations, which phase is primarily responsible for collecting and acquiring data from these diverse sources? Choose the one correct option.
A
Data pre-processing, as it involves cleaning and preparing the data for analysis, ensuring compliance with data privacy regulations.
B
Data transformation, where data is reformatted and normalized for analysis, crucial for handling the diversity of data sources.
C
Model deployment, the phase where the trained model is put into production, unrelated to the initial data collection.
D
Data ingestion, marking the beginning of the data pipeline by gathering raw data from diverse sources, including extraction and loading into storage systems.
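
For illustration only, here is a minimal sketch of what gathering raw records from the three sources named in the question might look like. Everything in it is an assumption made for the example: the in-memory sample strings stand in for real CRM exports, call detail records, and feedback forms, and the function names (`read_csv`, `read_jsonl`, `collect_raw_records`) are hypothetical. It copies records as-is into a single landing list and does no cleaning or normalization.

```python
import csv
import io
import json

# Hypothetical in-memory stand-ins for the three sources in the question.
CRM_CSV = "customer_id,plan\n1,basic\n2,premium\n"
CDR_CSV = "customer_id,call_seconds\n1,120\n2,45\n1,300\n"
FEEDBACK_JSONL = '{"customer_id": 2, "comment": "slow support"}\n'


def read_csv(text, source):
    """Extract rows from CSV text, tagging each one with its source name."""
    reader = csv.DictReader(io.StringIO(text))
    return [{"source": source, "raw": row} for row in reader]


def read_jsonl(text, source):
    """Extract records from JSON Lines text, tagging each one with its source name."""
    return [
        {"source": source, "raw": json.loads(line)}
        for line in text.splitlines()
        if line.strip()
    ]


def collect_raw_records():
    """Gather untransformed records from all sources into one landing list."""
    landing = []
    landing += read_csv(CRM_CSV, "crm")
    landing += read_csv(CDR_CSV, "cdr")
    landing += read_jsonl(FEEDBACK_JSONL, "feedback")
    return landing


records = collect_raw_records()
print(len(records))
```

Tagging each record with its source and leaving the values untouched keeps this step separate from later cleaning and reformatting; a real pipeline would load the records into a storage system rather than a Python list.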