
Answer-first summary for fast verification
Answer: D. Use a distributed processing framework like Spark to process the data from multiple sources with different formats and handle schema drift.
Option D is correct. Like option C, it uses a distributed processing framework such as Spark to process multiple sources and formats in one pipeline, but it also handles schema drift. That matters because sources evolve independently: fields can be added, removed, or change type over time, and a pipeline that assumes a fixed schema can fail or misread records when that happens. Handling drift explicitly keeps the results accurate as the inputs change.
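To make the idea of schema drift concrete, here is a minimal sketch in plain Python rather than Spark. It reads two sources in different formats (newline-delimited JSON and CSV), builds a unified schema as new fields appear, and pads missing fields with None. The function names and sample data are hypothetical and only illustrate the concept; a real Spark job would use the framework's own readers and schema handling instead.

```python
import csv
import io
import json


def parse_json_lines(text):
    """Parse newline-delimited JSON into a list of dicts."""
    return [json.loads(line) for line in text.splitlines() if line.strip()]


def parse_csv(text):
    """Parse CSV with a header row into a list of dicts (values stay strings)."""
    return [dict(row) for row in csv.DictReader(io.StringIO(text))]


def align_to_evolving_schema(batches):
    """Union the field names across all batches and pad gaps with None."""
    schema = []   # ordered list of every field name seen so far
    records = []
    for batch in batches:
        for record in batch:
            for field in record:
                if field not in schema:
                    schema.append(field)  # drift: a previously unseen field
            records.append(record)
    # Re-shape every record to the final schema so all rows share one layout.
    aligned = [{field: rec.get(field) for field in schema} for rec in records]
    return schema, aligned


# Hypothetical sample inputs: the second JSON record and the CSV source
# each introduce a field that earlier records did not have.
json_source = '{"id": 1, "temp": 21.5}\n{"id": 2, "temp": 19.0, "unit": "C"}'
csv_source = "id,temp,site\n3,18.2,north\n"

schema, rows = align_to_evolving_schema(
    [parse_json_lines(json_source), parse_csv(csv_source)]
)
```

Note that the CSV values remain strings here; reconciling types across sources is a separate step that this sketch leaves out.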
Author: LeetQuiz Editorial Team
In a stream processing solution, you need to process data from multiple sources with different data formats. How would you approach this task to ensure efficient and accurate processing?
A. Convert all the data to a single format before processing.
B. Process each data source independently using a specific processing approach for each format.
C. Use a distributed processing framework like Spark to process the data from multiple sources with different formats.
D. Use a distributed processing framework like Spark to process the data from multiple sources with different formats and handle schema drift.