How can a data engineering team efficiently make the Python library data_utils available to PySpark jobs across multiple notebooks on a Databricks cluster?
A. Run %pip install data_utils once in any notebook attached to the cluster.
B. Edit the cluster to use the Databricks Runtime for Data Engineering.
C. Set the PYTHONPATH variable in the cluster configuration to include the path to data_utils.
D. Add data_utils to the cluster's library dependencies using the spark.conf settings.
E. There is no way to make the data_utils library available to PySpark jobs on a cluster.
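For context, the mechanisms that options A and C refer to look roughly like this in a Databricks notebook. This is a minimal sketch; it assumes data_utils is published to PyPI and exposes a top-level module of the same name, neither of which the question states, and the PYTHONPATH value shown is a hypothetical path.

    # Option A's mechanism: a notebook-scoped install via the %pip magic command,
    # run in its own notebook cell (assumes data_utils is available on PyPI):
    # %pip install data_utils

    # Option C's mechanism: pointing PYTHONPATH at a copy of the library through the
    # cluster's environment-variable settings (the path below is hypothetical):
    # PYTHONPATH=/dbfs/libs/data_utils:$PYTHONPATH

    # However the library is made importable, PySpark code in a notebook then uses it directly:
    import data_utils  # assumes the package exposes a module named data_utils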