
Answer-first summary for fast verification
Answer: Use a data skew detection mechanism that identifies and redistributes skewed data in real-time to balance the load across processing nodes.
Option B is correct. Data skew occurs when a few keys or partitions receive a disproportionate share of records, so some nodes become bottlenecks while others sit idle. A skew detection mechanism identifies those hot spots and redistributes the data in real time to balance the load across processing nodes. Option A addresses data quality, but validation alone does not rebalance load, so it does not handle skew. Option C is incorrect because both data quality and skew directly affect the correctness and performance of a streaming pipeline. Option D does not scale: manual sampling cannot keep pace with a continuous stream and can miss issues outside the sample.
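To make the idea of detecting and redistributing skewed data concrete, here is a minimal sketch in Python. It is an illustration under stated assumptions, not the method of any particular streaming engine: the function names (`detect_hot_keys`, `route`), the 20% threshold, and the choice of key salting as the redistribution technique are all hypothetical choices made for this example.

```python
import random
import zlib
from collections import Counter


def detect_hot_keys(keys, threshold=0.2):
    """Return the keys whose share of the window exceeds `threshold`.

    `keys` is the list of record keys seen in one window (assumed to fit
    in memory for this sketch).
    """
    total = len(keys)
    if total == 0:
        return set()
    counts = Counter(keys)
    return {k for k, c in counts.items() if c / total > threshold}


def _stable_hash(text):
    # crc32 is stable across processes, unlike Python's built-in hash() for str.
    return zlib.crc32(text.encode("utf-8"))


def route(key, hot_keys, num_partitions, salt_buckets=4, rng=random):
    """Choose a partition for `key`.

    Normal keys always map to the same partition. Hot keys get a random
    salt appended, so their records spread over up to `salt_buckets`
    partitions instead of piling onto one.
    """
    if key in hot_keys:
        salt = rng.randrange(salt_buckets)
        return _stable_hash(f"{key}#{salt}") % num_partitions
    return _stable_hash(key) % num_partitions


if __name__ == "__main__":
    window = ["a"] * 8 + ["b", "c"]
    hot = detect_hot_keys(window, threshold=0.5)
    print("hot keys:", hot)
    print("partitions used by 'a':", {route("a", hot, 8) for _ in range(100)})
```

Salting trades ordering and locality for balance: records of a hot key no longer land on a single node, so any per-key aggregation downstream needs a second step that merges the partial results for each salted key.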
Author: LeetQuiz Editorial Team
In a data pipeline that processes streaming data, how can you ensure data quality and handle data skew in real-time?
A. Implement a data validation process that checks for data completeness, consistency, accuracy, and integrity in real-time as the data is ingested.
B. Use a data skew detection mechanism that identifies and redistributes skewed data in real-time to balance the load across processing nodes.
C. Ignore data quality and skew issues in streaming data pipelines, focusing only on the speed of data processing.
D. Manually inspect a sample of the streaming data to ensure quality and consistency before processing.
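For comparison, the per-record validation described in option A might look like the following sketch. The record schema (`id`, `user_id`, `amount`, `ts`), the error labels, and the in-memory `seen_ids` set are assumptions made for illustration only; a real pipeline would bound or expire the duplicate-tracking state.

```python
REQUIRED_FIELDS = ("id", "user_id", "amount", "ts")  # hypothetical schema


def validate_record(record, seen_ids):
    """Return a list of problems found in one record (empty if it is clean).

    `seen_ids` is a set of record ids already accepted; it is updated in
    place when a new id passes the duplicate check.
    """
    errors = []

    # Completeness: every required field must be present and non-null.
    for field in REQUIRED_FIELDS:
        if record.get(field) is None:
            errors.append(f"missing:{field}")

    # Consistency and accuracy: amount must be a non-negative number.
    amount = record.get("amount")
    if amount is not None:
        if not isinstance(amount, (int, float)) or amount < 0:
            errors.append("invalid:amount")

    # Integrity: reject ids that have already been seen.
    record_id = record.get("id")
    if record_id is not None:
        if record_id in seen_ids:
            errors.append("duplicate:id")
        else:
            seen_ids.add(record_id)

    return errors
```

Checks like these run once per record at ingestion and improve quality, but they leave the distribution of keys across nodes unchanged, which is why skew needs its own mechanism.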