Your ETL workloads on Databricks vary throughout the day. To keep your Spark jobs both cost-effective and scalable, which strategy would you adopt to scale cluster resources dynamically with workload demand?
A
Utilizing spot instances for Spark clusters whenever possible, with a fallback strategy to on-demand instances for critical jobs.
B
Implementing a custom monitoring solution that adjusts clusters based on CPU and memory utilization, integrating with Azure Cost Management for budget tracking.
C
Leveraging Databricks' autoscaling feature to automatically adjust the number of nodes in a cluster based on the workload, with cost constraints defined in the job configuration (a configuration sketch follows the options).
D
Scheduling Spark jobs during off-peak hours using Databricks jobs scheduler to capitalize on lower compute costs and reduced contention.
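Option C relies on cluster autoscaling combined with limits set in the job definition. Below is a minimal sketch of what such a job configuration might look like using the Databricks Jobs API payload shape; the job name, notebook path, node type, runtime version, worker limits, and timeout are placeholder assumptions for illustration, not values taken from the question.

```python
import json

# Sketch of a Databricks job configuration (Jobs API 2.1 payload shape)
# that uses autoscaling: the cluster grows and shrinks between
# min_workers and max_workers as the ETL workload changes.
job_config = {
    "name": "daily-etl",                                    # hypothetical job name
    "tasks": [
        {
            "task_key": "etl",
            "notebook_task": {"notebook_path": "/ETL/daily"},  # placeholder path
            "new_cluster": {
                "spark_version": "13.3.x-scala2.12",        # placeholder runtime
                "node_type_id": "Standard_DS3_v2",          # placeholder Azure node type
                "autoscale": {
                    "min_workers": 2,    # floor keeps the job responsive off-peak
                    "max_workers": 10,   # ceiling caps spend during peak load
                },
            },
            "timeout_seconds": 7200,     # guardrail against runaway runs
        }
    ],
}

print(json.dumps(job_config, indent=2))
```

The `autoscale` block is what makes the cluster elastic: Databricks adds workers up to `max_workers` when tasks queue up and removes them when they sit idle, so the min/max band is effectively the cost constraint expressed in the job configuration.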