
A team preparing data for machine learning models must integrate diverse datasets from multiple sources. The datasets vary in format, contain missing values, and have inconsistent scales. The team's goal is to prepare the data for analysis and model training while respecting constraints on time, cost, and scalability. Which of the following steps are MOST critical to achieving this goal? (Choose two options if E is available, otherwise choose one.)
A
Developing ML models without any data preprocessing to save time and costs.
B
Ensuring data is cleaned by removing errors, inconsistencies, and filling missing values.
C
Increasing the complexity of the data by adding more features without analysis.
D
Transforming the data into a suitable format for analysis, such as normalizing numerical data and encoding categorical variables.
E
Both B and D are necessary steps for effective data preparation.
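
As an illustration of what the cleaning and transformation steps named in options B and D might look like in practice, here is a minimal sketch in plain Python. The records, the field names (age, city), the choice of mean imputation, and the choice of min-max scaling are all assumptions made for the example and do not come from the question. Libraries such as pandas and scikit-learn offer equivalents; plain Python is used only to keep the sketch self-contained.

```python
from statistics import mean

# Hypothetical records merged from several sources; None marks a missing value.
rows = [
    {"age": 20, "city": "Paris"},
    {"age": None, "city": "Oslo"},
    {"age": 40, "city": "Paris"},
]


def clean(rows):
    """Cleaning step: fill missing ages with the mean of the observed ages."""
    observed = [r["age"] for r in rows if r["age"] is not None]
    fill = mean(observed)
    return [{**r, "age": fill if r["age"] is None else r["age"]} for r in rows]


def transform(rows):
    """Transformation step: min-max scale age and one-hot encode city."""
    ages = [r["age"] for r in rows]
    lo, hi = min(ages), max(ages)
    span = (hi - lo) or 1  # avoid division by zero when all ages are equal
    cities = sorted({r["city"] for r in rows})  # fixed column order
    return [
        [(r["age"] - lo) / span] + [1.0 if r["city"] == c else 0.0 for c in cities]
        for r in rows
    ]


# Each output row is [scaled_age, is_Oslo, is_Paris].
features = transform(clean(rows))
```

Cleaning runs before transformation here because the scaling range is computed from the ages, so missing values have to be filled first.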