
A company wants to implement a data pipeline to streamline the data processing tasks in its machine learning workflow. The company handles large volumes of data that require frequent updates and transformations before being used for model training. Given the need for automation, scalability, and efficiency, which of the following best describes the primary purpose of such a data pipeline? Choose the best option.
A
To simplify the process of model deployment by automating the transfer of data between storage and computing resources.
B
To automate the ingestion, transformation, and movement of data, ensuring that the data is processed efficiently and is ready for analysis and model training.
C
To significantly reduce data storage costs by compressing and archiving data that is not frequently accessed.
D
To provide a platform for data visualization, enabling data scientists to easily explore and understand the data before model training.
E
Both A and B are correct, as automating data processing also indirectly simplifies model deployment by ensuring data is readily available and properly formatted.
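To make the ingestion, transformation, and movement stages described in option B concrete, here is a minimal sketch of those three stages. All names (`ingest`, `transform`, `load`) are illustrative, not from any specific framework; the "source" and "sink" are in-memory lists standing in for real storage.

```python
# Illustrative sketch of a data pipeline's three core stages:
# ingestion, transformation, and movement of data toward model training.

def ingest(raw_rows):
    """Ingest: pull raw records from a source (here, an in-memory list)."""
    return list(raw_rows)

def transform(rows):
    """Transform: clean and normalize records so they are training-ready."""
    cleaned = [r for r in rows if r.get("value") is not None]  # drop incomplete rows
    max_value = max(r["value"] for r in cleaned)
    # Scale each value to the [0, 1] range.
    return [{**r, "value": r["value"] / max_value} for r in cleaned]

def load(rows, sink):
    """Move: deliver the processed rows to the training data store."""
    sink.extend(rows)
    return sink

raw = [{"id": 1, "value": 10}, {"id": 2, "value": None}, {"id": 3, "value": 40}]
training_store = []
load(transform(ingest(raw)), training_store)
print(training_store)
# → [{'id': 1, 'value': 0.25}, {'id': 3, 'value': 1.0}]
```

In a production setting these stages would typically be scheduled and chained by an orchestration tool so the pipeline runs automatically whenever new data arrives, which is exactly the automation and efficiency the question highlights.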