You are working on a data processing project that involves analyzing social media posts to identify trends and sentiment. The data includes a mix of structured and unstructured data, with high velocity and volume. Describe how you would design an ETL pipeline to handle this data, and explain the role of intermediate data staging locations in the pipeline.
Explanation:
Option B is the correct answer. A multi-stage ETL pipeline with intermediate data staging locations is needed to handle the mix of structured and unstructured data. Each stage writes its output to a staging location before the next stage reads it, so extraction, transformation, and loading can be managed and optimized independently. Ignoring the unstructured data would discard the post text that sentiment analysis depends on, and a single-stage ETL process would not be sufficient for these requirements. Traditional batch processing alone may not keep up with the velocity and volume of the data.
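To make the role of staging concrete, here is a minimal Python sketch of a three-stage pipeline in which each stage persists its output to a staging directory before the next stage reads it. Everything in it is an assumption for illustration: the directory names (raw, cleaned), the function names, the sample posts, and the keyword-based sentiment score, which stands in for a real sentiment model. It runs as a small batch job; for high-velocity data, a streaming ingestion layer would likely replace the extract stage.

```python
import json
import tempfile
from pathlib import Path

# Hypothetical word lists; a real pipeline would call a sentiment model here.
POSITIVE = {"great", "love", "good"}
NEGATIVE = {"bad", "hate", "awful"}


def write_jsonl(path, records):
    """Persist records to a staging file, one JSON object per line."""
    path.parent.mkdir(parents=True, exist_ok=True)
    with path.open("w", encoding="utf-8") as f:
        for rec in records:
            f.write(json.dumps(rec) + "\n")


def read_jsonl(path):
    """Read records back from a staging file."""
    with path.open(encoding="utf-8") as f:
        return [json.loads(line) for line in f if line.strip()]


def extract(posts, staging):
    """Stage 1: land raw posts unchanged in the raw staging area."""
    out = staging / "raw" / "posts.jsonl"
    write_jsonl(out, posts)
    return out


def transform(raw_path, staging):
    """Stage 2: tokenize the text and attach a toy sentiment score."""
    cleaned = []
    for post in read_jsonl(raw_path):
        words = post.get("text", "").lower().split()
        score = sum(w in POSITIVE for w in words) - sum(w in NEGATIVE for w in words)
        cleaned.append({"id": post["id"], "words": words, "sentiment": score})
    out = staging / "cleaned" / "posts.jsonl"
    write_jsonl(out, cleaned)
    return out


def load(cleaned_path):
    """Stage 3: aggregate the staged records into a summary for analysis."""
    records = read_jsonl(cleaned_path)
    return {
        "count": len(records),
        "total_sentiment": sum(r["sentiment"] for r in records),
    }


def run_pipeline(posts, staging):
    """Run the stages in order; each reads only what the previous one staged."""
    raw_path = extract(posts, staging)
    cleaned_path = transform(raw_path, staging)
    return load(cleaned_path)


# Made-up sample data for the sketch.
SAMPLE_POSTS = [
    {"id": 1, "text": "I love this great phone"},
    {"id": 2, "text": "awful battery, bad screen"},
    {"id": 3, "text": "good value"},
]

if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        print(run_pipeline(SAMPLE_POSTS, Path(tmp)))
```

Because the raw posts stay in the raw staging area, the transform stage can be rerun with a different scoring method without extracting from the source again, which is the main practical benefit the explanation attributes to staging.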