
Answer-first summary for fast verification
Answer: Use an Event-based trigger to start Spark jobs and configure the Databricks cluster to autoscale based on current workload.
An Event-based trigger launches the pipeline as soon as new data arrives (for example, when a file lands in storage), so scheduling adapts to dynamic data inputs instead of a fixed clock. Enabling autoscaling on the Databricks cluster lets it add or remove workers as data volume changes, keeping resource utilization and performance efficient without manual intervention.
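As a sketch of the trigger half of this answer, an Azure Data Factory storage-event trigger can fire whenever a new blob is created under a given path. The names, paths, and placeholder IDs below are illustrative, not values from the question:

```json
{
  "name": "NewDataFileTrigger",
  "properties": {
    "type": "BlobEventsTrigger",
    "typeProperties": {
      "blobPathBeginsWith": "/input-container/blobs/",
      "ignoreEmptyBlobs": true,
      "events": [ "Microsoft.Storage.BlobCreated" ],
      "scope": "/subscriptions/<subscription-id>/resourceGroups/<resource-group>/providers/Microsoft.Storage/storageAccounts/<storage-account>"
    },
    "pipelines": [
      {
        "pipelineReference": {
          "referenceName": "RunSparkJobPipeline",
          "type": "PipelineReference"
        }
      }
    ]
  }
}
```

Here the trigger invokes a pipeline (assumed to contain a Databricks activity) each time a file is uploaded, which is what makes the scheduling responsive to dynamic data inputs rather than a fixed timetable.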
Author: LeetQuiz Editorial Team
Design a data pipeline in Azure Data Factory that schedules the execution of Spark jobs on an Azure Databricks cluster. The pipeline needs to handle dynamic data inputs and adjust the cluster size based on the volume of data. Describe how you would manage the scheduling of these Spark jobs and the considerations for scaling the Databricks cluster.
A
Schedule Spark jobs using a fixed-time schedule trigger and use a static-sized Databricks cluster.
B
Use an Event-based trigger to start Spark jobs and configure the Databricks cluster to autoscale based on current workload.
C
Manually trigger Spark jobs and manually adjust the Databricks cluster size as needed.
D
Schedule Spark jobs to run at off-peak hours to avoid scaling issues.
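For the scaling half of the correct answer (B), the Databricks job cluster used by the pipeline can declare an autoscale range instead of a fixed worker count, so Databricks sizes the cluster to the workload. A minimal sketch of such a cluster specification, with an illustrative runtime version, node type, and worker range:

```json
{
  "new_cluster": {
    "spark_version": "13.3.x-scala2.12",
    "node_type_id": "Standard_DS3_v2",
    "autoscale": {
      "min_workers": 2,
      "max_workers": 8
    }
  }
}
```

With this configuration, small inputs run on the minimum number of workers while large inputs scale the cluster up toward the maximum, which is why option B handles varying data volumes where the static cluster in option A cannot.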