
How can a data engineering team efficiently make the Python library data_utils available to PySpark jobs across multiple notebooks on a Databricks cluster?
A. Run %pip install data_utils once in any notebook attached to the cluster.
B. Edit the cluster to use the Databricks Runtime for Data Engineering.
C. Set the PYTHONPATH variable in the cluster configuration to include the path to data_utils.
D. Add data_utils to the cluster's library dependencies using the spark.conf settings.
E. There is no way to make the data_utils library available to PySpark jobs on a cluster.
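For context on what a cluster-scoped installation looks like in practice, the sketch below uses the Databricks Libraries API to attach a PyPI package to a cluster, which makes it importable from every notebook attached to that cluster. The workspace URL, token, and cluster ID are placeholders, and it assumes data_utils is published to a reachable package index; this is an illustrative sketch, not the graded answer.

```python
# Minimal sketch: install a PyPI package at cluster scope via the
# Databricks Libraries API (POST /api/2.0/libraries/install).
import requests

DATABRICKS_HOST = "https://<your-workspace>.cloud.databricks.com"  # placeholder
DATABRICKS_TOKEN = "<personal-access-token>"                       # placeholder
CLUSTER_ID = "<cluster-id>"                                        # placeholder

resp = requests.post(
    f"{DATABRICKS_HOST}/api/2.0/libraries/install",
    headers={"Authorization": f"Bearer {DATABRICKS_TOKEN}"},
    json={
        "cluster_id": CLUSTER_ID,
        # Assumes data_utils is available on PyPI or a configured index;
        # a wheel file could be referenced with {"whl": "<path>"} instead.
        "libraries": [{"pypi": {"package": "data_utils"}}],
    },
)
resp.raise_for_status()
```

The same installation can be performed from the cluster's Libraries tab in the workspace UI; either way the library is installed for the whole cluster rather than for a single notebook session.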