Databricks Certified Data Engineer - Professional

How can you dynamically scale Spark resources in a Spark Streaming application consuming data from Kafka, especially when the workload varies significantly throughout the day?

A. Manually scale the cluster by adding or removing executors based on observed workload patterns.
B. Implement a feedback loop using Spark's StreamingListener to trigger scaling actions.
C. Use Kafka's consumer group metrics to drive scaling decisions.
D. Enable Spark's dynamic allocation feature.

Explanation:

Option D is the most suitable choice when a Spark Streaming application consumes data from Kafka and the workload varies significantly throughout the day. Spark's dynamic allocation feature lets the application automatically adjust the number of executors to match the current workload, maintaining both performance and resource utilization. Manual scaling (Option A) is time-consuming and error-prone; driving scaling from Kafka's consumer group metrics (Option C) may not react in real time; and building a feedback loop on Spark's StreamingListener (Option B) is more complex and less efficient than the built-in dynamic allocation feature.
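
For illustration, here is a minimal PySpark sketch of enabling dynamic allocation for a streaming job that reads from Kafka. The configuration keys are standard Spark settings; the executor bounds, broker address, topic name, and checkpoint path are placeholder assumptions, and on Databricks the equivalent behavior is usually configured through cluster autoscaling rather than in application code.

```python
from pyspark.sql import SparkSession

# Enable dynamic allocation when the session is created.
# The 2-20 executor range is illustrative; tune it to your workload.
spark = (
    SparkSession.builder
    .appName("kafka-stream-autoscale")
    .config("spark.dynamicAllocation.enabled", "true")
    .config("spark.dynamicAllocation.minExecutors", "2")   # floor during quiet hours
    .config("spark.dynamicAllocation.maxExecutors", "20")  # ceiling during peak load
    .config("spark.shuffle.service.enabled", "true")       # prerequisite for dynamic allocation
    .getOrCreate()
)

# Illustrative Kafka source (Structured Streaming API);
# broker address and topic name are placeholders.
events = (
    spark.readStream
    .format("kafka")
    .option("kafka.bootstrap.servers", "broker-1:9092")
    .option("subscribe", "events")
    .load()
)

query = (
    events.writeStream
    .format("console")
    .option("checkpointLocation", "/tmp/checkpoints/events")  # placeholder path
    .start()
)
```

With a configuration like this, Spark requests additional executors as batches back up during peak hours and releases idle ones when traffic subsides, with no manual intervention.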