
Answer-first summary for fast verification
Answer: Applying filters on event type columns to isolate specific data subsets before downstream processing.
In a multiplexed Bronze architecture, records from many sources or event types land in a single table. Each downstream consumer should therefore filter on the event type column so that it processes only the records it needs instead of parsing every row. Where the table layout allows it (for example, when the table is partitioned on that column), the filter also lets Spark skip irrelevant files and reduce I/O, which makes each stream more efficient.
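The filtering step can be sketched as follows. This is a minimal illustration, not a reference implementation: the table name `bronze_events`, the column name `event_type`, and both helper functions are hypothetical, and `spark` is assumed to be an active SparkSession supplied by the caller.

```python
import re

# Hypothetical names: "bronze_events" and "event_type" are assumptions
# made for illustration; substitute the real table and column.


def event_filter(event_type: str, column: str = "event_type") -> str:
    """Build a SQL predicate that selects a single event type."""
    # Accept only simple identifiers so the value can be embedded safely.
    if not re.fullmatch(r"[A-Za-z0-9_]+", event_type):
        raise ValueError(f"unsupported event type: {event_type!r}")
    return f"{column} = '{event_type}'"


def read_event_stream(spark, event_type: str, table: str = "bronze_events"):
    """Return a streaming DataFrame restricted to one event type."""
    return (
        spark.readStream                     # incremental read of the Bronze table
        .table(table)
        .filter(event_filter(event_type))    # filter before any downstream parsing
    )
```

A consumer would call something like `read_event_stream(spark, "clicks")` and then parse the payload and write to its own Silver table, so each downstream stream handles only its own event type.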
Author: LeetQuiz Editorial Team
Which best practice should a data engineer follow to ensure efficient streaming from a multiplexed Bronze table containing multiple event types?
A. Writing every unique event type into its own dedicated physical table during initial ingestion.
B. Enforcing a static, uniform batch size across all micro-batches to stabilize throughput.
C. Applying filters on event type columns to isolate specific data subsets before downstream processing.
D. Disabling schema evolution to reduce metadata overhead and improve read performance.