
Explanation:
Databricks Jobs provide failure isolation, not cross-task atomicity. Each task in a workflow is managed as an independent unit of work. Every write that Task A or Task B performs is committed to the Lakehouse as soon as it completes, as its own Delta Lake ACID transaction, so their changes are already durable by the time both tasks report success. If Task C fails, only the operations within Task C are affected.
While Delta Lake ensures atomicity within a single write operation (transaction), there is no built-in mechanism to automatically roll back successful tasks in a workflow because a downstream or parallel task fails. Therefore, the work of A and B remains intact, and Task C might be left in a partially completed state, depending on how many individual transactions it had committed before the failure.
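The behavior described above can be illustrated with a small toy model. This is a plain-Python sketch, not Databricks or Delta Lake code: `ToyTable`, `run_task`, and `run_job` are hypothetical names invented for illustration, and Tasks B and C are run one after the other here only to keep the example deterministic. Each `commit` call stands in for a single atomic write, and nothing ties the commits of different tasks together.

```python
from dataclasses import dataclass, field


@dataclass
class ToyTable:
    """Hypothetical stand-in for a Delta table: each commit is all-or-nothing."""
    versions: list = field(default_factory=list)

    def commit(self, rows, fail=False):
        # A write either lands as one new version or leaves no trace.
        if fail:
            raise RuntimeError("write failed before commit")
        self.versions.append(list(rows))


def run_task(name, writes, table, status):
    """Run one task; each (rows, fail) pair in `writes` is a separate transaction."""
    try:
        for rows, fail in writes:
            table.commit(rows, fail=fail)
        status[name] = "SUCCESS"
    except RuntimeError:
        # Only the failing write is discarded; commits made earlier stay in place.
        status[name] = "FAILED"


def run_job():
    table, status = ToyTable(), {}
    run_task("A", [(["a1"], False)], table, status)
    # In the real job B and C run in parallel; the order does not change the outcome.
    run_task("B", [(["b1"], False)], table, status)
    # Task C commits one write, then fails on its second write.
    run_task("C", [(["c1"], False), (["c2"], True)], table, status)
    return table.versions, status
```

In this sketch, calling `run_job()` leaves the commits from A and B in the table, along with the one write Task C finished before failing. That mirrors the point that a failed task can leave partial results and that no job-level rollback occurs.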
In a Databricks Workflow, a job is configured with three tasks: Task A runs first. Upon its completion, Tasks B and C are triggered to run in parallel. If Tasks A and B finish successfully but Task C fails, which of the following describes the state of the Lakehouse?
A. The changes made by Tasks A and B are persisted in the Lakehouse, while Task C's failure does not trigger a rollback of those successful tasks.
B. No changes will be saved to the Lakehouse; a failure in any task (like Task C) triggers an automatic rollback of the entire job's progress to ensure consistency.
C. Databricks uses a global commit barrier, meaning changes from Tasks A and B are only finalized and visible once Task C also succeeds.
D. The results of Tasks A and B are saved, but Databricks will automatically identify and undo every single write operation performed by Task C before the failure occurred.