Databricks Certified Data Engineer - Professional


Which metric in Ganglia would indicate optimal resource utilization for a cluster with 3 executor nodes?




Explanation:

Optimal utilization means the cluster's VMs are kept busy without being saturated. A consistent Five Minute Load Average (Option A) indicates stability but says nothing about whether the load level itself is appropriate. CPU Utilization around 75% (Option B) shows that the executors are actively doing work while leaving headroom for spikes, which matches the usual practice of balancing efficiency against the risk of bottlenecks. Network I/O that never spikes (Option C) more likely points to underutilization, since spikes are normal during data processing. Total Disk Space remaining constant (Option D) reflects storage capacity, not how heavily the cluster is used during compute operations. The best indicator is therefore CPU Utilization around 75% (Option B), because it directly reflects effective CPU usage without overcommitment.
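The reasoning behind Option B can be sketched as a simple check on sampled CPU readings. This is only an illustration: the function name `classify_cpu_utilization`, the 65% to 85% band, and the sample readings are assumptions made for this example, not values from Ganglia or Databricks. The explanation itself only says "around 75%".

```python
from statistics import mean

# Hypothetical band around the 75% figure; these bounds are assumptions
# chosen for illustration, not thresholds defined by Ganglia or Databricks.
TARGET_LOW = 65.0
TARGET_HIGH = 85.0


def classify_cpu_utilization(samples, low=TARGET_LOW, high=TARGET_HIGH):
    """Classify the average CPU utilization (in percent) of one node."""
    if not samples:
        raise ValueError("samples must not be empty")
    avg = mean(samples)
    if avg < low:
        return "underutilized"  # resources are mostly idle
    if avg > high:
        return "saturated"      # little or no headroom left for spikes
    return "well utilized"      # busy, with headroom remaining


# Made-up readings for a cluster with 3 executor nodes.
readings = {
    "executor-1": [72.0, 78.0, 75.0],
    "executor-2": [20.0, 30.0, 25.0],
    "executor-3": [95.0, 99.0, 97.0],
}

for node, samples in readings.items():
    print(node, classify_cpu_utilization(samples))
```

In this sketch only the first node falls in the band the explanation describes. The other two show the failure modes it contrasts against: idle capacity and overcommitment.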