
In the context of preparing data for machine learning, a team is tasked with ensuring that the data is well-organized, relationships among data entities are clearly defined, and the data is primed for analysis. This involves creating entity-relationship diagrams, designing data schemas, and applying data normalization techniques. Which phase of the machine learning data preparation process is primarily responsible for these tasks? Choose the best option.
A. Data collection, where raw data is gathered from various sources without any structuring.
B. Data pre-processing, which involves cleaning and transforming data to improve its quality and usability.
C. Data integration, focusing on combining data from multiple sources into a unified dataset.
D. Data modeling, dedicated to establishing the data schema, structures, and relationships to ensure the data is well-organized and ready for analysis.
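
To make the normalization task named in the question concrete, here is a minimal Python sketch. It assumes a hypothetical flat table of orders in which customer details are repeated on every row; the field names (`order_id`, `customer_name`, `customer_email`, `amount`) and the `normalize` helper are illustrative and not part of the question. The sketch splits the flat rows into two related tables linked by a `customer_id` key, which is the kind of one-to-many relationship an entity-relationship diagram would record.

```python
def normalize(flat_rows):
    """Split flat order rows into a customers table and an orders table.

    Assumption: customer_email uniquely identifies a customer.
    """
    customers = {}  # customer_email -> customer record
    orders = []
    for row in flat_rows:
        email = row["customer_email"]
        if email not in customers:
            # First time this customer appears: assign a surrogate key.
            customers[email] = {
                "customer_id": len(customers) + 1,
                "name": row["customer_name"],
                "email": email,
            }
        # Each order keeps only a foreign key to its customer.
        orders.append({
            "order_id": row["order_id"],
            "customer_id": customers[email]["customer_id"],
            "amount": row["amount"],
        })
    return list(customers.values()), orders


# Hypothetical sample data: customer details repeat on every order row.
flat_rows = [
    {"order_id": 101, "customer_name": "Ada", "customer_email": "ada@example.com", "amount": 25.0},
    {"order_id": 102, "customer_name": "Ada", "customer_email": "ada@example.com", "amount": 40.0},
    {"order_id": 103, "customer_name": "Lin", "customer_email": "lin@example.com", "amount": 15.5},
]

customers, orders = normalize(flat_rows)
```

After the split, each customer's name and email are stored once, and every order refers to its customer through `customer_id` rather than repeating those details.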