Your company is building a data lake on AWS to store and process data from various sources, including IoT devices, web applications, and third-party APIs. The data lake must handle both batch and real-time data ingestion. Which AWS services would you recommend for this scenario, and how would you design the ingestion pipeline to support both modes?
Explanation:
For this scenario, the most appropriate choice is Amazon S3 for data storage, Amazon Kinesis for real-time data ingestion, and AWS Data Pipeline for batch data ingestion. Amazon S3 provides scalable, durable storage for the data lake. Amazon Kinesis ingests streaming data as it arrives, for example from IoT devices and web applications, while AWS Data Pipeline schedules and manages batch loads, such as periodic pulls from third-party APIs. Together, these services cover both ingestion modes the data lake requires.
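The two ingestion paths described above can be sketched in Python. This is only an illustration under stated assumptions, not a reference design: the `raw/<source>/year=/month=/day=` key layout, the function names, and the use of the device ID as the partition key are all hypothetical choices. The batch function stands in for the upload step that a scheduled job (such as one run by AWS Data Pipeline) might perform; it is not AWS Data Pipeline itself. The clients are passed in, and are expected to behave like the boto3 Kinesis and S3 clients (`put_record` and `upload_file`).

```python
import json
import os
from datetime import datetime, timezone


def build_kinesis_record(event: dict, device_id: str) -> dict:
    """Serialize one event into the arguments Kinesis put_record expects."""
    return {
        "Data": json.dumps(event, sort_keys=True).encode("utf-8"),
        # Assumption: partition by device so one device's events stay ordered.
        "PartitionKey": device_id,
    }


def batch_object_key(source: str, run_time: datetime, filename: str) -> str:
    """Build a date-partitioned S3 key (hypothetical layout) for a batch file."""
    return (
        f"raw/{source}/year={run_time:%Y}/month={run_time:%m}/"
        f"day={run_time:%d}/{filename}"
    )


def send_realtime(kinesis_client, stream_name: str, event: dict, device_id: str):
    """Real-time path: push a single event onto a Kinesis data stream."""
    record = build_kinesis_record(event, device_id)
    return kinesis_client.put_record(StreamName=stream_name, **record)


def upload_batch(s3_client, bucket: str, source: str, local_path: str,
                 run_time: datetime) -> str:
    """Batch path: copy a local extract file into the data lake bucket."""
    key = batch_object_key(source, run_time, os.path.basename(local_path))
    s3_client.upload_file(local_path, bucket, key)
    return key
```

With boto3 installed and credentials configured, the clients would typically be created with `boto3.client("kinesis")` and `boto3.client("s3")`; the stream and bucket names are whatever exists in your account. Keeping the record and key construction in separate pure functions makes the layout easy to test without touching AWS.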