You are working on a data processing project that analyzes social media posts to identify trends and sentiment. The data is a mix of structured and unstructured content arriving at high velocity and in high volume. How would you design an ETL pipeline to handle this data, and what role would intermediate data staging locations play in the pipeline?
A. Use a single-stage ETL process to load the data directly into a data lake and perform all transformations and analysis there.
B. Create a multi-stage ETL pipeline with intermediate data staging locations to handle the different data types and sources, and perform transformations at each stage.
C. Only process structured data and ignore unstructured data due to the complexity of handling different data types.
D. Use a traditional batch processing approach to handle the data, as it is more cost-effective than using a distributed computing framework like Apache Spark.
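
Option B describes a multi-stage pipeline with intermediate staging locations. Below is a minimal sketch of that pattern, assuming PySpark; the bucket paths and column names (`text`, `created_at`) are hypothetical, and a real pipeline would add schema validation, error handling, and incremental loads.

```python
from pyspark.sql import SparkSession
from pyspark.sql import functions as F

spark = SparkSession.builder.appName("social-media-etl").getOrCreate()

# Stage 1 (extract): land raw posts exactly as received. The landing zone
# preserves the source data so later stages can be replayed from it.
raw = spark.read.json("s3://example-bucket/raw/posts/")  # hypothetical path
raw.write.mode("append").parquet("s3://example-bucket/staging/landing/")

# Stage 2 (transform): cleanse and normalize in an intermediate staging zone.
# Staging decouples this step from ingestion, so a failure here does not
# require re-extracting from the source.
landed = spark.read.parquet("s3://example-bucket/staging/landing/")
cleansed = (
    landed
    .filter(F.col("text").isNotNull())            # drop posts with no text
    .withColumn("text", F.lower(F.col("text")))   # normalize unstructured text
    .withColumn("posted_at", F.to_timestamp("created_at"))
)
cleansed.write.mode("overwrite").parquet("s3://example-bucket/staging/cleansed/")

# Stage 3 (load): aggregate into a curated zone ready for trend analysis.
staged = spark.read.parquet("s3://example-bucket/staging/cleansed/")
trends = (
    staged
    .withColumn("token", F.explode(F.split(F.col("text"), r"\s+")))
    .filter(F.col("token").startswith("#"))       # keep hashtags only
    .groupBy(F.window("posted_at", "1 hour"), "token")
    .count()
)
trends.write.mode("overwrite").parquet("s3://example-bucket/curated/hashtag_trends/")
```

Each write to a staging location acts as a checkpoint: downstream stages read a known, validated schema rather than raw source data, and the high-velocity ingest in stage 1 can run on a different schedule from the heavier transformations that follow.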