
Answer-first summary for fast verification
Answer: Use the Databricks Runtime for Machine Learning with autoscaling enabled, and install the distributed processing library using the Library management system at the cluster scope.
Option B is the most efficient approach: enabling autoscaling on a Databricks Runtime for Machine Learning cluster lets the environment adapt to varying computational loads, and installing the distributed processing library at the cluster scope through the Library management system makes it available to every notebook attached to the cluster, eliminating manual per-notebook installations.
Author: LeetQuiz Editorial Team
In a scenario where you are optimizing a Databricks environment for a machine learning pipeline, you need to ensure that the Databricks Runtime for Machine Learning is configured to handle the specific computational requirements of your models. Additionally, you must install a Python library that supports distributed processing. Which of the following steps should you take to achieve this?
A
Create a cluster with the Databricks Runtime for Machine Learning, and manually install the distributed processing library in each notebook.
B
Use the Databricks Runtime for Machine Learning with autoscaling enabled, and install the distributed processing library using the Library management system at the cluster scope.
C
Create a custom Databricks Runtime for Machine Learning that includes the distributed processing library in its base image.
D
Instruct team members to install the distributed processing library in their local development environments and synchronize the installations across all notebooks.