
Answer-first summary for fast verification
Answer: Leverage Apache Kafka for real-time data ingestion, use AWS Glue for ETL, and Amazon Redshift for real-time analytics.
Option B is the optimal solution as it uses Apache Kafka for handling real-time data streams, AWS Glue for efficient ETL processes, and Amazon Redshift for real-time analytics, ensuring a robust and integrated system.
Author: LeetQuiz Editorial Team
Ultimate access to all questions.
You are designing a data processing system for a healthcare provider that needs to analyze large volumes of patient data, including electronic health records (EHRs) and genomic data. The system must handle both structured and unstructured data and provide real-time analytics. Describe how you would design this system, focusing on the technologies and architectures you would use to achieve real-time processing and data integration.
A
Use batch processing with SQL databases and ignore unstructured data.
B
Leverage Apache Kafka for real-time data ingestion, use AWS Glue for ETL, and Amazon Redshift for real-time analytics.
C
Store all data in a single database and process it using scheduled batch jobs.
D
Manually process each data source separately without integrating them.
No comments yet.