AWS Certified Data Engineer - Associate


Your company wants to implement a data lake for storing and processing large volumes of structured and unstructured data. Which of the following AWS services would you recommend for this use case, and why?




Explanation:

Option A is the correct choice. Amazon S3 is a highly scalable, durable object storage service that can hold large volumes of both structured and unstructured data in any format, and it is cost-effective for long-term storage. Amazon EMR is a managed big data platform that processes and analyzes data stored in S3 using frameworks such as Apache Spark, Apache Hadoop, and Presto (or Trino, formerly named PrestoSQL). Together, the two services keep storage separate from compute, which gives a flexible and scalable foundation for a data lake.

The other options fall short. Option B is not suitable for unstructured data, because RDS is designed for relational databases. Option C is a good choice for data warehousing but is less flexible for unstructured data. Option D is useful for real-time data processing but is less cost-effective for storing large volumes of data.
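To make the S3 plus EMR pairing concrete, here is a minimal Python sketch of how a Spark job might be handed to an EMR cluster so that the script, the input, and the output all live in S3. It is an illustration under assumptions, not part of the exam material: the bucket name `example-lake`, the script path, the step name, and the helper names `build_spark_step` and `submit_step` are all hypothetical. The step dictionary follows the shape that boto3's EMR client accepts in `add_job_flow_steps`.

```python
from typing import Dict, List


def build_spark_step(name: str, script_uri: str,
                     input_uri: str, output_uri: str) -> Dict:
    """Build an EMR step definition that runs spark-submit on a script in S3.

    All three locations must be S3 URIs, because in this sketch the data
    lake (S3) is the only storage layer and EMR supplies only the compute.
    """
    for uri in (script_uri, input_uri, output_uri):
        if not uri.startswith("s3://"):
            raise ValueError(f"expected an s3:// URI, got {uri!r}")

    args: List[str] = [
        "spark-submit",
        "--deploy-mode", "cluster",
        script_uri,   # PySpark script stored in the data lake
        input_uri,    # passed to the script as its first argument
        output_uri,   # passed to the script as its second argument
    ]
    return {
        "Name": name,
        "ActionOnFailure": "CONTINUE",
        # command-runner.jar lets an EMR step run a command such as spark-submit
        "HadoopJarStep": {"Jar": "command-runner.jar", "Args": args},
    }


def submit_step(cluster_id: str, step: Dict) -> str:
    """Submit the step to a running EMR cluster and return the new step ID.

    Requires boto3 and AWS credentials; cluster_id is a placeholder the
    caller must supply.
    """
    import boto3  # imported here so build_spark_step works without boto3

    emr = boto3.client("emr")
    response = emr.add_job_flow_steps(JobFlowId=cluster_id, Steps=[step])
    return response["StepIds"][0]
```

A call such as `build_spark_step("clean-logs", "s3://example-lake/jobs/clean.py", "s3://example-lake/raw/", "s3://example-lake/curated/")` produces the step definition without touching AWS. Only `submit_step` needs credentials and a running cluster. Keeping every path in S3 reflects the point of Option A: the cluster can be resized or terminated without affecting the stored data.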