
A Spark application running on a Databricks cluster is experiencing performance issues, and you are tasked with identifying and resolving the bottlenecks. The application processes large datasets and must comply with strict SLAs on execution time. Considering the need for cost efficiency and scalability, which of the following steps would you prioritize when using the Spark UI to accurately identify the root cause of the performance issues? Choose the best option.
A
Focus solely on the 'Jobs' tab to review the number of jobs submitted, without analyzing the stages or tasks for any anomalies.
B
Analyze the 'Stages' tab to identify stages with unusually high execution times or failures, and then examine the 'Tasks' tab within those stages to look for skewed task execution or outliers.
C
Limit your investigation to the 'Environment' tab to check the cluster's configuration settings, ignoring the actual performance metrics and logs.
D
Only review the 'Storage' tab to assess the data distribution, without considering the potential impact of data skew or partition sizes on performance.
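If the Stages/Tasks analysis described in option B surfaces a stage whose tasks are dominated by a few long-running outliers, a common follow-up is to enable Adaptive Query Execution and redistribute the data more evenly. The sketch below is a minimal, illustrative example; the table path, column names, and partition count are assumptions, not part of the question.

```python
# Minimal PySpark sketch of one common skew mitigation, assuming the Spark UI
# showed a stage with a handful of straggler tasks. All names/paths are illustrative.
from pyspark.sql import SparkSession

spark = (
    SparkSession.builder
    .appName("skew-mitigation-sketch")
    # AQE can split skewed shuffle partitions automatically (Spark 3.x).
    .config("spark.sql.adaptive.enabled", "true")
    .config("spark.sql.adaptive.skewJoin.enabled", "true")
    .getOrCreate()
)

# Hypothetical large fact table; "customer_id" stands in for the skewed key
# observed as long-running outlier tasks in the Tasks view.
events = spark.read.parquet("/mnt/data/events")

# Repartitioning on a more evenly distributed composite key spreads the work
# across tasks instead of concentrating it on a few stragglers.
balanced = events.repartition(200, "customer_id", "event_date")

balanced.write.mode("overwrite").parquet("/mnt/data/events_balanced")
```

Whether repartitioning, salting the join key, or relying on AQE alone is appropriate depends on what the Stages and Tasks tabs actually show; the point of option B is that the evidence is gathered there first.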