
Answer-first summary for fast verification
Answer: %sh executes shell code exclusively on the local driver machine, leading to significant performance overhead.
The %sh magic command runs shell code from a Databricks notebook cell. The shell process runs only on the Apache Spark driver node, not on the worker nodes, so the work is not distributed across the cluster and large workloads can be slow. For more details, refer to the [Databricks documentation](https://docs.databricks.com/notebooks/notebooks-code.html#mix-languages).
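The driver-only behavior can be illustrated with a minimal sketch. It is an approximation for intuition, not how Databricks implements %sh. The helper names `run_on_driver` and `hosts_seen_by_spark` are hypothetical, and the second helper assumes a cluster where an existing `SparkSession` named `spark` exposes `sparkContext`.

```python
import socket
import subprocess


def run_on_driver(cmd):
    """Run a shell command on the machine hosting this Python process.

    In a notebook that machine is the driver, which is roughly the
    situation %sh is in: one process, one node, no Spark parallelism.
    """
    result = subprocess.run(
        cmd, shell=True, capture_output=True, text=True, check=True
    )
    return result.stdout.strip()


def hosts_seen_by_spark(spark, num_partitions=8):
    """Collect the hostnames of the machines that execute Spark tasks.

    The lambda runs inside Spark tasks, so on a multi-node cluster it is
    scheduled on the executors rather than in a single shell on the driver.
    `spark` is assumed to be an existing SparkSession.
    """
    rdd = spark.sparkContext.parallelize(range(num_partitions), num_partitions)
    return sorted(set(rdd.map(lambda _: socket.gethostname()).collect()))


if __name__ == "__main__":
    # The shell command reports a single host: the one running this process.
    print(run_on_driver("hostname"))
```

On a multi-node cluster, calling `hosts_seen_by_spark(spark)` would be expected to report worker hostnames, while `run_on_driver("hostname")` reports only the driver. This is why refactoring legacy shell logic into Spark code lets the work spread across the cluster.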
Author: LeetQuiz Editorial Team
A junior data engineer is utilizing the %sh magic command to execute some legacy code. A senior data engineer suggests refactoring the code instead. What could be the reason for avoiding the use of the %sh magic command?
A. %sh executes shell code exclusively on the local driver machine, leading to significant performance overhead.
B. None of these reasons accurately describe why %sh might need to be avoided.
C. %sh restarts the Python interpreter, clearing all variables declared in the notebook.
D. %sh cannot access storage to persist the output.
E. All the above reasons explain why %sh may need to be avoided.