
Explanation:
In a stream-static join against a Delta table, the static side is not frozen when the streaming query starts. Each time a micro-batch is processed, its records are joined against the latest valid version of the static Delta table at that moment. Updates committed to the static table while the stream is running are therefore visible to subsequent micro-batches without stopping and restarting the query. The join is also stateless: records from earlier micro-batches are not retained and re-joined when the static table changes later.
Why other options are incorrect:
Option A: the checkpoint directory tracks the progress of the streaming query, such as source offsets, not changes to the static table.
Option B: a stream-static join is stateless, so no per-key join state is written to the checkpoint.
Option D: stream-static joins are supported in Spark Structured Streaming, including joins against Delta tables.
When performing a join between a streaming DataFrame and a static Delta table in Databricks, which of the following statements accurately describes the behavior of the join operation?
A
The checkpoint directory maintains a record of updates to the static Delta table to ensure data consistency across micro-batches.
B
The join is treated as a stateful operation, and state information for specific join keys is persisted in the checkpoint folder.
C
Each micro-batch in the streaming query joins against the latest valid version of the static Delta table as of the time that micro-batch is processed.
D
Stream-static joins are unsupported in Spark Structured Streaming because Delta Lake cannot guarantee isolation levels for concurrent static reads.
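
For reference, a join of the kind the question describes might be declared in PySpark roughly as follows. This is a minimal sketch and not part of the original question: the function name build_stream_static_join and its parameters are hypothetical, and it assumes both sides are Delta tables registered in the metastore and a Spark version that provides spark.readStream.table (3.1 or later).

```python
from typing import TYPE_CHECKING

if TYPE_CHECKING:
    # Imported for type hints only; the caller supplies a live SparkSession.
    from pyspark.sql import DataFrame, SparkSession


def build_stream_static_join(
    spark: "SparkSession",
    stream_table: str,
    static_table: str,
    join_key: str,
) -> "DataFrame":
    """Join a streaming read of one table with a batch read of another.

    Hypothetical helper for illustration; the table names and the join key
    are placeholders chosen by the caller.
    """
    # Static side: an ordinary batch read, declared with spark.read.
    static_df = spark.read.table(static_table)

    # Streaming side: an incremental read, declared with spark.readStream.
    stream_df = spark.readStream.table(stream_table)

    # Stream-static inner join with the streaming DataFrame on the left.
    return stream_df.join(static_df, on=join_key, how="inner")


# Hypothetical usage (table names and checkpoint path are placeholders):
# query = (
#     build_stream_static_join(spark, "orders_stream", "customers", "customer_id")
#     .writeStream
#     .option("checkpointLocation", "/tmp/checkpoints/orders_enriched")
#     .toTable("orders_enriched")
# )
```

Only the streaming side is read with readStream; the static side is a plain batch DataFrame, and the checkpoint location is configured on the streaming write rather than on the join itself.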