Databricks Certified Data Engineer - Professional

A data architect has designed a system where two Structured Streaming jobs concurrently write to a single bronze Delta table. Each job consumes data from a different Apache Kafka topic but writes records with identical schemas. To simplify the directory structure, a data engineer proposes using a shared nested checkpoint directory for both streams, as shown below:

/bronze
  - _checkpoint
  - _delta_log
  - year_week=2020_01
  - year_week=2020_02

Is this checkpoint directory structure valid for the given scenario? Explain why or why not.

No; Delta Lake manages streaming checkpoints in the transaction log.

5.1%

Yes; both of the streams can share a single checkpoint directory.

7.6%
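
For reference, Structured Streaming tracks each query's progress (offsets and commits) in its own checkpoint location, so two concurrent queries cannot safely point at the same directory even when they append to the same Delta table. The following is a minimal PySpark sketch of one way to give each stream a separate checkpoint directory. The helper names, the `_checkpoints/<stream_name>` layout, the broker address, and the topic names are hypothetical and do not come from the question; payload parsing and the `year_week` partitioning are omitted.

```python
def checkpoint_path(table_root, stream_name):
    """Return a per-stream checkpoint directory under the table root.

    The `_checkpoints/<stream_name>` layout is an assumption for illustration.
    """
    return f"{table_root.rstrip('/')}/_checkpoints/{stream_name}"


def start_bronze_stream(spark, topic, table_root="/bronze",
                        bootstrap_servers="broker:9092"):
    """Start one Kafka-to-Delta stream that uses its own checkpoint directory.

    `spark` is an existing SparkSession; the broker address is a placeholder.
    """
    source = (
        spark.readStream.format("kafka")
        .option("kafka.bootstrap.servers", bootstrap_servers)
        .option("subscribe", topic)
        .load()
    )
    return (
        source.writeStream.format("delta")
        .outputMode("append")
        # Each query gets a distinct checkpoint location derived from its topic.
        .option("checkpointLocation", checkpoint_path(table_root, topic))
        .start(table_root)
    )
```

On a cluster with the Kafka and Delta connectors available, calling `start_bronze_stream(spark, "topic_a")` and `start_bronze_stream(spark, "topic_b")` would start two queries that append to `/bronze` while keeping their checkpoint state under `/bronze/_checkpoints/topic_a` and `/bronze/_checkpoints/topic_b` respectively.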
