
You are designing a machine learning pipeline for a Databricks project. Your organization requires a scalable, maintainable solution that follows best practices for data preprocessing, feature engineering, model training, and model deployment. The solution must also support experiment tracking, model management, and easy integration with other Databricks applications. Which of the following approaches BEST meets these requirements? (Choose one option.)
A. Use a single Databricks notebook to perform every step of the machine learning pipeline: data preprocessing, feature engineering, model training, and model deployment.
B. Design a multi-step machine learning pipeline using Databricks notebooks, use MLflow for experiment tracking and model management, and deploy the trained model with Databricks Model Serving.
C. Leverage a third-party machine learning platform to handle data preprocessing, feature engineering, and model training, and use Databricks notebooks for data exploration and visualization.
D. Implement a custom solution combining Databricks notebooks, Apache Spark MLlib, and a custom model-serving framework for deployment, and use a custom monitoring system to track the performance of the deployed model.
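For context, the pattern described in option B, a pipeline step that trains a model under MLflow tracking and registers it for deployment, might look like the following. This is a minimal sketch: the experiment path, the registered model name, and the synthetic dataset are illustrative assumptions, not part of the question.

```python
import mlflow
import mlflow.sklearn
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split

# Synthetic placeholder data; in a real pipeline this would come from the
# preprocessing and feature-engineering steps upstream.
X, y = make_classification(n_samples=1000, n_features=10, random_state=42)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=42)

mlflow.set_experiment("/Shared/example_pipeline")  # hypothetical experiment path

with mlflow.start_run(run_name="rf_baseline"):
    params = {"n_estimators": 100, "max_depth": 5}
    model = RandomForestClassifier(**params, random_state=42)
    model.fit(X_train, y_train)

    # Experiment tracking: parameters and metrics are logged to MLflow.
    mlflow.log_params(params)
    mlflow.log_metric("test_accuracy", accuracy_score(y_test, model.predict(X_test)))

    # Model management: logging with a registered model name places the model
    # in the registry, from which it can be deployed via Databricks Model Serving.
    mlflow.sklearn.log_model(
        model,
        artifact_path="model",
        registered_model_name="example_model",  # hypothetical registry name
    )
```

The point of this structure is the separation of concerns the question asks about: notebooks per pipeline step, MLflow for tracking and the model registry, and a managed serving layer for deployment, all within Databricks rather than in custom or third-party infrastructure.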