
An upstream system provides Change Data Capture (CDC) logs representing INSERT, UPDATE, and DELETE operations for a source table with a primary key (pk_id). The data governance team requires a solution that preserves a full historical audit trail while also maintaining a table containing only the most recent records for analytical queries. Data is ingested hourly, and individual records may undergo multiple changes within a single ingestion window.
Which solution effectively meets these requirements while following Databricks Medallion Architecture best practices?
A
Ingest all incoming CDC logs into an append-only bronze table to maintain a full history, then use MERGE INTO to upsert the most recent version of each pk_id into a silver table to represent the current state.
B
Utilize MERGE INTO directly on a bronze table to handle incoming CDC operations for each pk_id, then propagate these changes downstream to minimize storage requirements.
C
Implement Delta Lake Change Data Feed (CDF) to automatically ingest and process external CDC logs directly from the source system into the Lakehouse.
D
Sequence the incoming changes chronologically and apply them to a target table, relying on Delta Lake’s internal table versioning and time travel to serve as the required audit log.
E
Create separate history tables for every unique pk_id and resolve the current state by performing a UNION operation of the most recent entries during query execution.
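
To make the mechanics of the pattern in option A concrete, here is a minimal pure-Python sketch, not Spark or Delta Lake code. It models the bronze table as an append-only list and the silver table as a dictionary keyed by pk_id. The record fields `op`, `seq`, and `payload`, and the function name `ingest_batch`, are hypothetical and chosen only for illustration. The sketch also assumes that `seq` gives a total ordering of changes and that batches arrive in order.

```python
# Hypothetical CDC record shape (an assumption, not from the source system):
#   {"pk_id": int, "op": "INSERT" | "UPDATE" | "DELETE", "seq": int, "payload": dict | None}

def ingest_batch(bronze: list, silver: dict, batch: list) -> None:
    """Append one batch to bronze, then apply the latest change per pk_id to silver."""
    # Bronze is append-only: every change is kept, so the full history survives.
    bronze.extend(batch)

    # A record may change several times in one window, so keep only the
    # change with the highest sequence number for each pk_id.
    latest = {}
    for change in batch:
        seen = latest.get(change["pk_id"])
        if seen is None or change["seq"] > seen["seq"]:
            latest[change["pk_id"]] = change

    # Upsert (or delete) into silver so it holds only the current state.
    for pk_id, change in latest.items():
        if change["op"] == "DELETE":
            silver.pop(pk_id, None)
        else:
            silver[pk_id] = change["payload"]


bronze, silver = [], {}
ingest_batch(bronze, silver, [
    {"pk_id": 1, "op": "INSERT", "seq": 1, "payload": {"name": "a"}},
    {"pk_id": 1, "op": "UPDATE", "seq": 2, "payload": {"name": "b"}},
    {"pk_id": 2, "op": "INSERT", "seq": 3, "payload": {"name": "c"}},
    {"pk_id": 2, "op": "DELETE", "seq": 4, "payload": None},
])
```

After this batch, `bronze` holds all four change records, while `silver` holds only the latest version of pk_id 1. The deduplication step matters because a merge-style upsert is typically expected to see at most one source row per key.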