
Your company is planning to implement a machine learning pipeline to support predictive analytics. Describe the steps you would take to design and implement an ETL pipeline for a machine learning project, and explain the considerations involved in preparing the data for machine learning models.
A
Use a single-stage ETL process to load all data into a machine learning platform and perform all transformations and analysis there, without considering data preparation for machine learning models.
B
Design a multi-stage ETL pipeline with data preprocessing and feature engineering steps to prepare the data for machine learning models, leveraging Apache Spark's machine learning libraries and frameworks.
C
Use a traditional statistical analysis approach to prepare the data for machine learning models, as it is more accurate than using an ETL pipeline.
D
Focus only on the ETL process and ignore the data preparation aspect, as it is not relevant to machine learning models.
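
To make the multi-stage approach in option B concrete, here is a minimal sketch of the stages in plain Python. It is an illustration under stated assumptions, not a reference implementation: the sample data, the column names, and the function names (`extract`, `clean`, `engineer_features`, `load`) are all hypothetical. In a Spark-based pipeline the same steps would typically be expressed as `pyspark.ml` pipeline stages such as `Imputer`, `StandardScaler`, `StringIndexer`, and `OneHotEncoder`.

```python
import csv
import io
import statistics

# Hypothetical raw input; in practice this would come from source systems.
RAW_CSV = """customer_id,age,plan,monthly_spend
1,34,basic,20.0
2,,premium,55.5
3,45,basic,
4,29,premium,60.0
"""


def extract(text):
    """Stage 1: read raw rows without changing them."""
    return list(csv.DictReader(io.StringIO(text)))


def clean(rows, numeric_cols):
    """Stage 2: parse numerics and fill missing values with the column median."""
    parsed = [dict(r) for r in rows]
    for col in numeric_cols:
        present = [float(r[col]) for r in parsed if r[col] != ""]
        median = statistics.median(present)
        for r in parsed:
            r[col] = float(r[col]) if r[col] != "" else median
    return parsed


def engineer_features(rows, numeric_cols, categorical_col):
    """Stage 3: standardize numeric columns and one-hot encode a categorical one.

    Assumption: statistics are computed on all rows for brevity. A real
    pipeline would fit them on the training split only to avoid leakage.
    """
    categories = sorted({r[categorical_col] for r in rows})
    stats = {}
    for col in numeric_cols:
        values = [r[col] for r in rows]
        # Fall back to 1.0 when a column is constant to avoid dividing by zero.
        stats[col] = (statistics.mean(values), statistics.pstdev(values) or 1.0)
    features = []
    for r in rows:
        scaled = [(r[c] - stats[c][0]) / stats[c][1] for c in numeric_cols]
        one_hot = [1.0 if r[categorical_col] == cat else 0.0 for cat in categories]
        features.append(scaled + one_hot)
    return features


def load(features):
    """Stage 4: hand the feature matrix to the training step (here, just return it)."""
    return features


NUMERIC = ["age", "monthly_spend"]
rows = clean(extract(RAW_CSV), NUMERIC)
feature_matrix = load(engineer_features(rows, NUMERIC, "plan"))
```

Keeping extraction, cleaning, and feature engineering as separate stages means each one can be validated and rerun on its own, which is the main contrast with the single-stage approach in option A.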