
Answer-first summary for fast verification
Answer: Rewrite the job in Apache Spark.
The correct answer is B. Apache Spark typically outperforms Hadoop MapReduce because it processes data in memory, avoiding the disk I/O that MapReduce performs between each map and reduce stage, and its lazy evaluation lets the engine optimize an entire job before executing it. Rewriting the job requires some one-time development effort, but it improves responsiveness without adding cluster resources, which satisfies the no-additional-cost constraint. Increasing the cluster size (option C) adds cost, shrinking it (option D) reduces capacity, and Pig (option A) traditionally compiles down to MapReduce jobs, so it would not remove the underlying bottleneck.
Author: LeetQuiz Editorial Team
Your company, which has experienced rapid growth, is now ingesting data at a much higher rate than before. You are responsible for managing the daily batch MapReduce analytics jobs in an Apache Hadoop environment. Due to the recent surge in data, these batch jobs are now lagging behind schedule. You have been tasked with recommending ways for the development team to enhance the responsiveness of the analytics processes without incurring additional costs. What advice should you provide to them?
A. Rewrite the job in Pig.
B. Rewrite the job in Apache Spark.
C. Increase the size of the Hadoop cluster.
D. Decrease the size of the Hadoop cluster but also rewrite the job in Hive.