
Answer-first summary for fast verification
Answer: CDF is useful when only a small fraction of records are updated in each batch.
CDF is designed to propagate incremental data changes to downstream tables in a multi-hop architecture, so it pays off when only a small fraction of records is updated in each batch. Such updates typically arrive from external sources in CDC format. If most records in the table are updated, or the table is overwritten in each batch (as with the nightly overwrite in this question), CDF is not recommended. Reference: https://www.databricks.com/blog/2021/06/09/how-to-simplify-cdc-with-delta-lakes-change-data-feed.html
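As a rough illustration of the feature under discussion, the sketch below shows one way to enable CDF on an existing Delta table and read the recorded changes in batch mode. It assumes a `spark` session with Delta Lake already configured is passed in; the helper names `enable_cdf_sql` and `read_changes` and any table name are hypothetical and chosen only for this example.

```python
# Minimal sketch, not a definitive implementation.
# Assumptions: `spark` is an existing SparkSession with Delta Lake configured;
# the helper names and table names are hypothetical.


def enable_cdf_sql(table_name: str) -> str:
    """Build the SQL statement that turns on the change data feed for a Delta table."""
    return (
        f"ALTER TABLE {table_name} "
        "SET TBLPROPERTIES (delta.enableChangeDataFeed = true)"
    )


def read_changes(spark, table_name: str, starting_version: int):
    """Batch-read the row-level changes recorded from `starting_version` onward."""
    return (
        spark.read.format("delta")
        .option("readChangeFeed", "true")              # request the change feed, not the table snapshot
        .option("startingVersion", starting_version)   # first table version to include
        .table(table_name)
    )
```

With a live session, usage might look like `spark.sql(enable_cdf_sql("dim_customer"))` followed by `read_changes(spark, "dim_customer", 1)`. The returned rows carry extra metadata columns such as `_change_type` and `_commit_version`, which downstream jobs can filter on.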
Author: LeetQuiz Editorial Team
A junior data engineer suggests enabling the Change Data Feed (CDF) feature on a Type 1 table that is overwritten nightly with new data. What is the correct response to this suggestion?
A
CDF cannot be enabled on existing tables; it's only for newly created tables.
B
CDF is beneficial when the table is a Slowly Changing Dimension (SCD) of Type 2.
C
CDF is useful when only a small fraction of records are updated in each batch.
D
Data changes captured by CDF can only be read in streaming mode.
E
All the above are correct responses to the data engineer’s suggestion.