In the context of building a scalable and efficient data pipeline for a machine learning project, which of the following practices are essential to ensure high data quality throughout the pipeline? (Choose two.)
A. Exclusively utilizing batch processing to avoid the complexities of streaming data
B. Implementing comprehensive data validation and cleansing procedures at each stage of the pipeline
C. Omitting data validation to expedite the pipeline processing time
D. Minimizing transformation steps to reduce computational overhead without considering data quality impacts
E. Incorporating both batch and streaming data processing with tailored validation mechanisms for each
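
To see what the two validation-centric options (B and E) describe in practice, here is a minimal Python sketch of per-stage validation and cleansing, with separate batch and streaming paths that each apply checks suited to how the data arrives. The field names and helper functions (`validate_schema`, `clean_record`, `validate_business_rules`) are illustrative assumptions, not any particular framework's API.

```python
"""Sketch of per-stage validation (option B) plus separate batch and
streaming paths with tailored checks (option E). Names are illustrative."""

from dataclasses import dataclass
from typing import Iterable, Iterator

REQUIRED_FIELDS = {"user_id", "event_time", "value"}  # assumed schema


@dataclass
class Record:
    user_id: str
    event_time: str
    value: float


def validate_schema(raw: dict) -> bool:
    """Stage 1: every required field is present and value is numeric."""
    return REQUIRED_FIELDS <= raw.keys() and isinstance(raw["value"], (int, float))


def clean_record(raw: dict) -> Record:
    """Stage 2: cleanse and normalize types before downstream use."""
    return Record(
        user_id=str(raw["user_id"]).strip(),
        event_time=str(raw["event_time"]).strip(),
        value=float(raw["value"]),
    )


def validate_business_rules(rec: Record) -> bool:
    """Stage 3: re-check domain rules after transformation."""
    return rec.value >= 0 and rec.user_id != ""


def process_batch(rows: Iterable[dict]) -> list[Record]:
    """Batch path: validate the whole chunk, dropping and counting bad rows."""
    good, rejected = [], 0
    for raw in rows:
        if not validate_schema(raw):
            rejected += 1
            continue
        rec = clean_record(raw)
        if validate_business_rules(rec):
            good.append(rec)
        else:
            rejected += 1
    print(f"batch: kept {len(good)}, rejected {rejected}")
    return good


def process_stream(events: Iterable[dict]) -> Iterator[Record]:
    """Streaming path: validate event-by-event so one bad event never blocks the flow."""
    for raw in events:
        if not validate_schema(raw):
            continue  # in production, route to a dead-letter queue instead
        rec = clean_record(raw)
        if validate_business_rules(rec):
            yield rec


if __name__ == "__main__":
    sample = [
        {"user_id": " u1 ", "event_time": "2024-01-01T00:00:00Z", "value": 3},
        {"user_id": "u2", "event_time": "2024-01-01T00:01:00Z"},            # missing field
        {"user_id": "u3", "event_time": "2024-01-01T00:02:00Z", "value": -1},  # rule violation
    ]
    print(process_batch(sample))
    print(list(process_stream(sample)))
```

The batch path reports a rejected-row count, which supports data-quality monitoring over large chunks, while the streaming path validates each event independently so a malformed event cannot stall the consumer. This is the distinction options B and E point at, whereas options A, C, and D all trade data quality away for convenience or speed.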