
NO.17 You are responsible for writing your company's ETL pipelines to run on an Apache Hadoop cluster. The pipeline will require some checkpointing and splitting pipelines. Which method should you use to write the pipelines?
A. PigLatin using Pig
B. HiveQL using Hive
C. Java using MapReduce
D. Python using MapReduce
Explanation:
PigLatin using Pig is the correct choice for this scenario because:
Pig Latin is a procedural data-flow language built for ETL-style processing on Hadoop. It natively supports splitting a pipeline into multiple branches with the SPLIT operator and checkpointing intermediate results with STORE, so stages can be persisted and reused without extra plumbing. HiveQL is declarative and better suited to querying than to multi-stage data flows, and writing MapReduce jobs directly in Java or Python would require implementing the splitting and checkpointing logic by hand.
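For illustration, a minimal Pig Latin sketch (the input path, output paths, and schema below are hypothetical) showing how SPLIT branches a pipeline and how an intermediate STORE serves as a checkpoint:

    -- Load raw events (hypothetical path and schema, for illustration only)
    raw = LOAD '/data/raw_events' USING PigStorage(',')
          AS (id:int, type:chararray, amount:double);

    -- Split the pipeline into two branches by event type
    SPLIT raw INTO purchases IF type == 'purchase',
                   refunds   IF type == 'refund';

    -- Checkpoint one branch by storing its intermediate result
    STORE purchases INTO '/checkpoints/purchases' USING PigStorage(',');

    -- Continue the other branch of the pipeline
    grouped = GROUP refunds BY id;
    refund_totals = FOREACH grouped GENERATE group AS id,
                    SUM(refunds.amount) AS total;
    STORE refund_totals INTO '/output/refund_totals' USING PigStorage(',');

Achieving the same split-and-checkpoint behavior with raw Java or Python MapReduce would require chaining several jobs and managing the intermediate HDFS outputs yourself.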