
A data engineer discovers that a critical field from a Kafka source was omitted during ingestion into a Delta Lake pipeline. The field exists in the Kafka source but is missing from all downstream storage. Because Kafka retains data for only seven days while the pipeline has been running for three months, most of the affected history can no longer be recovered by replaying the source.
How can Delta Lake be used to prevent this type of permanent data loss in the future?
A. Ingesting all raw data and metadata into a Bronze Delta table to create a permanent, replayable history of the data state.
B. Utilizing Delta Lake schema evolution to retroactively compute values for newly added fields based on original source properties.
C. Configuring Delta Lake to automatically include every field from the source data in the ingestion layer by default.
D. Relying on the Delta log and Structured Streaming checkpoints, which maintain a complete history of the Kafka producer's records.