
Answer-first summary for fast verification
Answer: Each micro-batch in the streaming query joins against the version of the static Delta table that was available at the time the streaming job was initialized.
In a stream-static join, the streaming side is processed incrementally, while the static Delta table is read as an ordinary batch DataFrame that is defined when the streaming query starts. The keyed answer assumes that every micro-batch joins against the table version captured at that point, so updates written to the static table while the stream is running are not visible to the streaming job until it is stopped and restarted. Note, however, that the Databricks documentation on stream-static joins states that each micro-batch joins against the latest valid version of the static Delta table, so verify this behavior on your runtime before relying on a fixed snapshot.

**Why other options are incorrect:**

* **Checkpointing:** Checkpoints track streaming offsets (the progress of the stream) and state for stateful operations; they do not track changes to static tables.
* **Statefulness:** Stream-static joins are **stateless**. Spark does not need to maintain a state store for join keys, as it does in stream-stream joins.
* **Support:** Stream-static joins are supported and are a primary use case for Delta Lake, commonly used to enrich real-time event streams with slowly changing dimensions or metadata.
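As a rough illustration of the join discussed above, the following PySpark sketch wires a streaming Delta source to a static Delta table. The function names, the paths passed in, and the `device_id` join column are hypothetical and chosen only for this example; the code assumes a Spark session with Delta Lake available and is not a definitive implementation.

```python
from __future__ import annotations

from typing import TYPE_CHECKING

if TYPE_CHECKING:  # used for type hints only; a Spark runtime supplies these classes
    from pyspark.sql import DataFrame, SparkSession


def enrich_events(spark: SparkSession, events_path: str, dim_path: str) -> DataFrame:
    """Join a streaming Delta source with a static Delta table (hypothetical paths)."""
    # Streaming side: read incrementally from a Delta table.
    events = spark.readStream.format("delta").load(events_path)
    # Static side: an ordinary batch read of the dimension table.
    dims = spark.read.format("delta").load(dim_path)
    # Stream-static inner join on a hypothetical key column.
    # No state store is needed for this join, unlike a stream-stream join.
    return events.join(dims, on="device_id", how="inner")


def start_query(enriched: DataFrame, out_path: str, checkpoint_path: str):
    """Start the streaming write; the checkpoint tracks stream progress only."""
    return (
        enriched.writeStream.format("delta")
        .outputMode("append")
        .option("checkpointLocation", checkpoint_path)
        .start(out_path)
    )
```

Splitting the join definition from the write keeps the stateless join easy to inspect on its own. The checkpoint location is attached only to the streaming write, which matches the point that checkpoints record stream offsets and not changes to the static table.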
Author: LeetQuiz Editorial Team
When performing a join between a streaming DataFrame and a static Delta table in Databricks, which of the following statements accurately describes the behavior of the join operation?
A. The checkpoint directory maintains a record of updates to the static Delta table to ensure data consistency across micro-batches.
B. The join is treated as a stateful operation, and state information for specific join keys is persisted in the checkpoint folder.
C. Each micro-batch in the streaming query joins against the version of the static Delta table that was available at the time the streaming job was initialized.
D. Stream-static joins are unsupported in Spark Structured Streaming because Delta Lake cannot guarantee isolation levels for concurrent static reads.