
Answer-first summary for fast verification
Answer: Use a distributed processing framework like Spark to process the data from multiple sources with varying data quality and implement data cleaning and preprocessing techniques.
Option D is correct because it combines two things the scenario requires. A distributed processing framework such as Spark scales the workload across multiple sources, and explicit data cleaning and preprocessing address the varying data quality before results are computed. Option C uses the same framework but leaves the quality problems in the data, so the output can be inaccurate. Option A ignores the problem entirely, and option B handles each source in isolation, which is harder to scale and maintain.
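The cleaning and preprocessing step that distinguishes option D can be sketched in plain Python. This is only an illustration under stated assumptions: the field names (`source`, `user_id`, `amount`, `ts`) and the helper names `clean_record` and `clean_batch` are hypothetical and do not come from the question. In a Spark job, similar per-record logic would typically be expressed as DataFrame transformations or applied through a map or UDF.

```python
from datetime import datetime, timezone
from typing import Iterable, Optional


def clean_record(raw: dict) -> Optional[dict]:
    """Normalize one raw record; return None if it cannot be repaired.

    Field names are hypothetical, chosen only for this sketch.
    """
    # Required key: trim whitespace and normalize case; drop if missing.
    user_id = str(raw.get("user_id") or "").strip().lower()
    if not user_id:
        return None

    # Sources may send amounts as numbers or as strings like "1,234.50".
    try:
        amount = float(str(raw.get("amount", "")).replace(",", "").strip())
    except ValueError:
        return None

    # Timestamps may arrive as epoch seconds or ISO 8601 strings.
    ts = raw.get("ts")
    try:
        if isinstance(ts, (int, float)):
            parsed = datetime.fromtimestamp(ts, tz=timezone.utc)
        else:
            parsed = datetime.fromisoformat(str(ts).strip())
            if parsed.tzinfo is None:
                # Assumption: naive timestamps are treated as UTC.
                parsed = parsed.replace(tzinfo=timezone.utc)
    except (ValueError, OverflowError, OSError):
        return None

    return {
        "source": raw.get("source", "unknown"),
        "user_id": user_id,
        "amount": amount,
        "ts": parsed.isoformat(),
    }


def clean_batch(records: Iterable[dict]) -> list:
    """Clean every record, drop unrepairable ones, and deduplicate."""
    seen = set()
    cleaned = []
    for raw in records:
        rec = clean_record(raw)
        if rec is None:
            continue
        key = (rec["user_id"], rec["ts"])  # the same event from two sources
        if key in seen:
            continue
        seen.add(key)
        cleaned.append(rec)
    return cleaned
```

Rejecting unrepairable records outright keeps the sketch short; a production pipeline would more likely route them to a separate location for inspection instead of discarding them silently.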
Author: LeetQuiz Editorial Team
In a stream processing solution, you need to process data from multiple sources with varying data quality. How would you approach this task to ensure efficient and accurate processing?
A. Ignore the data quality issues and process the data as is.
B. Process each data source independently using a specific processing approach for each data quality issue.
C. Use a distributed processing framework like Spark to process the data from multiple sources with varying data quality.
D. Use a distributed processing framework like Spark to process the data from multiple sources with varying data quality and implement data cleaning and preprocessing techniques.