
Answer-first summary for fast verification
Answer: The streaming query joins every micro-batch against the snapshot of the static Delta table that was present at the moment the job was initialized.
In a **stream-static join**, the static table is materialized exactly once at the start of the streaming query. Every subsequent micro-batch joins incoming stream records against that same in-memory snapshot. Updates made to the underlying static table on storage are not picked up until the query is stopped and restarted.

### Why other options are incorrect:

* **Checkpoints:** The checkpoint folder tracks the streaming **offsets** (identifying which data from the source has been processed), not changes or versions of the static table.
* **Statefulness:** Stream-static joins are **stateless**. Spark does not store per-key state in the checkpoint; stateful tracking is only required for stream-stream joins.
* **Support:** Joining a stream to a static Delta table is a standard and common design pattern, often used to enrich real-time event streams with slowly-changing dimensions or lookup data.
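The checkpoint and statefulness points can be illustrated with a toy model in plain Python. This is not Spark code: `join_micro_batch`, `run`, the event shape, and the `last_offset` key are all hypothetical names chosen for illustration, and the static side is simply a dict passed in by the caller. The sketch only aims to show that each micro-batch is joined with no per-key state carried between batches, and that the checkpoint records source progress only.

```python
from typing import Dict, List, Tuple

# Hypothetical event shape for this sketch: (source offset, user_id).
Event = Tuple[int, str]


def join_micro_batch(batch: List[Event], lookup: Dict[str, str]) -> List[Tuple[str, str]]:
    """Inner-join one micro-batch against the lookup data.

    The function is pure: nothing is remembered from earlier batches,
    which is what "stateless" means for this kind of join.
    """
    return [(user_id, lookup[user_id]) for _, user_id in batch if user_id in lookup]


def run(batches: List[List[Event]], lookup: Dict[str, str]):
    """Process batches in order and keep a toy checkpoint of source progress."""
    checkpoint = {"last_offset": -1}  # offsets only, no per-key join state
    output: List[Tuple[str, str]] = []
    for batch in batches:
        output.extend(join_micro_batch(batch, lookup))
        if batch:
            # Record how far the stream source has been consumed.
            checkpoint["last_offset"] = batch[-1][0]
    return output, checkpoint


batches = [[(0, "u1"), (1, "u2")], [(2, "u3"), (3, "u1")]]
lookup = {"u1": "gold", "u2": "silver"}
output, checkpoint = run(batches, lookup)
print(output)      # joined rows; "u3" has no match and is dropped by the inner join
print(checkpoint)  # only the last processed offset is recorded
```

In this toy run the checkpoint ends up holding a single offset, while nothing about individual keys or about the lookup data is stored in it.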
Author: LeetQuiz Editorial Team
When performing a join between a streaming DataFrame and a static Delta table in Databricks, which of the following statements is true regarding the behavior of the join?
A. The streaming query joins every micro-batch against the snapshot of the static Delta table that was present at the moment the job was initialized.
B. The checkpoint directory is responsible for tracking version changes and updates made to the static Delta table during the streaming job's execution.
C. Spark maintains state information for every unique key in the join within the checkpoint folder to support stateful stream-static joins.
D. Static Delta tables cannot be used in stream-static joins because Spark cannot guarantee exactly-once processing for the static side of the join.