
A data engineer is configuring two independent Structured Streaming jobs. Both jobs consume data from different Kafka topics but write to the same Delta Lake bronze table using an identical schema. The proposed directory structure is as follows:
./bronze/_checkpoint   (shared)
./bronze/_delta_log
./bronze/year_week=2023.02

Can both streaming queries safely share the single ./bronze/_checkpoint folder? Why or why not?
A
Yes, it is a supported practice for multiple streaming jobs writing to the same destination Delta table to share a single checkpoint location.
B
No, each Structured Streaming job must use its own unique checkpoint directory to maintain independent state and track offsets correctly.
C
No, because Delta Lake leverages its internal transaction log for state tracking, an external checkpoint directory is not required and will cause conflicts.
D
Yes, sharing a checkpoint is possible as long as the year_week partitions are distinct for each job's write operations.
E
Technically, this layout can work for small batches, but giving each job its own checkpoint folder is recommended only to improve fault isolation and debugging.
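As a sketch of the checkpoint-per-query setup described in option B, the two jobs can append to the same Delta table while each maintains its own checkpoint directory. Topic names, the broker address, and the checkpoint paths below are illustrative assumptions, not part of the original question:

```python
# Illustrative sketch: two independent Structured Streaming queries write to
# the same Delta table, but each uses its OWN checkpointLocation so that
# offsets and state are tracked independently.
from pyspark.sql import SparkSession

spark = SparkSession.builder.appName("bronze-ingest").getOrCreate()

def start_ingest(topic: str, checkpoint_dir: str):
    """Start one Kafka -> Delta bronze ingestion stream."""
    return (
        spark.readStream
        .format("kafka")
        .option("kafka.bootstrap.servers", "broker:9092")  # assumed broker
        .option("subscribe", topic)
        .load()
        .writeStream
        .format("delta")
        .option("checkpointLocation", checkpoint_dir)  # unique per query
        .outputMode("append")
        .start("./bronze")  # shared Delta table path
    )

# Hypothetical topics; note the distinct checkpoint directories.
q1 = start_ingest("topic_a", "./bronze/_checkpoint_topic_a")
q2 = start_ingest("topic_b", "./bronze/_checkpoint_topic_b")
```

Delta Lake's transaction log serializes the concurrent appends to the table itself; the separate checkpoints are what keep each query's Kafka offset tracking from colliding.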