
Ultimate access to all questions.
NO.17 You are responsible for writing your company's ETL pipelines to run on an Apache Hadoop cluster. The pipeline will require some checkpointing and splitting pipelines. Which method should you use to write the pipelines?
Explanation:
Python using MapReduce is the correct choice for this scenario because:
Python provides the right balance of performance, flexibility, and development efficiency needed for ETL pipelines with checkpointing and splitting requirements on a Hadoop cluster.