
You are developing a modular data pipeline in Databricks and have separated reusable utility functions (such as data validation, logging, and custom transformations) into dedicated notebooks. Your main ETL notebook needs to use these utilities, and any change to a utility notebook should be reflected automatically in the ETL workflow without code duplication.
Given the following requirements:
The utility code should be executed in the same context as the main notebook, allowing variables and Spark sessions to be shared.
The approach should support parameterization, so the utility notebook can accept arguments from the main notebook.
The solution should be maintainable and scalable for a team of data engineers working collaboratively.
Which of the following approaches best satisfies all requirements? Select the best answer and explain why the other options are less suitable.
A
Use the %run magic command to include utility notebooks at the top of the main ETL notebook, and pass parameters via global variables.
B
Use the dbutils.notebook.run() method to call utility notebooks, passing parameters as arguments, and handle outputs via return values.
C
Copy and paste the utility code directly into the main ETL notebook to ensure all code is present in one place.
D
Use external Python scripts stored in DBFS and import them as modules in the main notebook.
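For reference, minimal sketches of the mechanisms named in options A, B, and D follow. All notebook paths, table names, and helper functions (such as validate_schema) are hypothetical and only illustrate how each mechanism is typically invoked.

Option A relies on the %run magic, which executes the target notebook inline in the current context; it must be the only command in its cell and does not take named arguments, so parameters are shared as ordinary variables.

```python
%run ./utils/data_validation
```

```python
# Later cell in the main ETL notebook: anything the utility notebook
# defined (e.g. a hypothetical validate_schema function) is now in
# scope, and both notebooks share the same variables and SparkSession.
raw_df = spark.table("raw_sales")        # hypothetical source table
validated_df = validate_schema(raw_df)   # helper defined in the utility notebook
```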
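Option B uses dbutils.notebook.run(), which launches the utility notebook as a separate run with its own context: parameters are passed as a string-to-string map, and the only value returned is the string handed to dbutils.notebook.exit().

```python
# Main ETL notebook -- a sketch with a hypothetical path, timeout, and parameters.
result = dbutils.notebook.run(
    "./utils/data_validation",                              # utility notebook path
    600,                                                    # timeout in seconds
    {"table_name": "raw_sales", "run_date": "2024-01-01"},  # parameters passed as strings
)
print(f"Utility notebook returned: {result}")
```

```python
# Inside the utility notebook: read the parameters via widgets and
# return a status string to the caller.
table_name = dbutils.widgets.get("table_name")
run_date = dbutils.widgets.get("run_date")
# ... validation logic ...
dbutils.notebook.exit("validation_passed")
```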
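Option D treats the utilities as a plain Python module. A rough sketch, assuming a file has been uploaded to dbfs:/pipeline/utils.py and that the cluster exposes DBFS on the driver through the /dbfs FUSE mount:

```python
# Main ETL notebook -- module directory and helper names are hypothetical.
import sys

sys.path.append("/dbfs/pipeline")            # directory containing utils.py
import utils                                 # hypothetical utility module

raw_df = spark.table("raw_sales")            # hypothetical source table
clean_df = utils.validate_schema(raw_df)     # hypothetical helper in the module
```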