
A Databricks user is troubleshooting pipeline execution times. The user is currently testing transformations interactively, running cells multiple times and using display() calls to verify correctness. Which of the following adjustments would give a more accurate measure of how the code will perform in a production environment?
A
The Jobs UI should be leveraged to occasionally run the notebook as a job and track execution time during incremental code development because Photon can only be enabled on clusters launched for scheduled jobs.
B
The only way to meaningfully troubleshoot code execution times in development notebooks is to use production-sized data and production-sized clusters with Run All execution.
C
Production code development should only be done using an IDE; executing code against a local build of open source Spark and Delta Lake will provide the most accurate benchmarks for how code will perform in production.
D
Calling display() forces a job to trigger, while many transformations will only add to the logical query plan; because of caching, repeated execution of the same logic does not provide meaningful results.
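
The mechanics that option D refers to (transformations that only extend a plan, an action that triggers execution, and cached results on re-runs) can be sketched with a toy model. This is plain Python, not the Spark or Databricks API: the `LazyFrame` class, its `transform` and `show` methods, and the `rows_processed` counter are all hypothetical names made up for illustration, and the automatic result cache is a simplifying assumption about how repeated runs can be served without redoing the work.

```python
# Toy model of lazy evaluation in plain Python; this is NOT the Spark API.
class LazyFrame:
    """Hypothetical stand-in for a DataFrame that records steps instead of running them."""

    def __init__(self, rows, plan=(), stats=None):
        self._rows = rows
        self._plan = tuple(plan)
        self._cache = None
        # Shared counter so work done by any derived frame is visible.
        self.stats = stats if stats is not None else {"rows_processed": 0}

    def transform(self, fn):
        # A transformation only extends the plan; no data is touched here.
        return LazyFrame(self._rows, self._plan + (fn,), self.stats)

    def show(self):
        # An action (playing the role of display()) is what actually runs the plan.
        if self._cache is None:
            out = list(self._rows)
            for fn in self._plan:
                out = [fn(r) for r in out]
                self.stats["rows_processed"] += len(out)
            self._cache = out  # assumption: results are reused on later calls
        return self._cache


df = LazyFrame(range(5))
step = df.transform(lambda x: x + 1).transform(lambda x: x * 2)

work_before_action = step.stats["rows_processed"]  # only a plan exists so far
first = step.show()                                # first action runs both steps
work_after_first = step.stats["rows_processed"]
second = step.show()                               # repeat call reuses the cache
work_after_second = step.stats["rows_processed"]
```

In this sketch the counter does not move until `show()` is called, and it does not move again on the second call, which is why timing a re-run of the same cell would say little about the cost of a cold run.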