
An online retail company has more than 50 million active customers and receives more than 25,000 orders each day. The company collects purchase data for customers and stores this data in Amazon S3. Additional customer data is stored in Amazon RDS.
The company wants to make all the data available to various teams so that the teams can perform analytics. The solution must provide the ability to manage fine-grained permissions for the data and must minimize operational overhead.
Which solution will meet these requirements?
A
Migrate the purchase data to write directly to Amazon RDS. Use RDS access controls to limit access.
B
Schedule an AWS Lambda function to periodically copy data from Amazon RDS to Amazon S3. Create an AWS Glue crawler. Use Amazon Athena to query the data. Use S3 policies to limit access.
C
Create a data lake by using AWS Lake Formation. Create an AWS Glue JDBC connection to Amazon RDS. Register the S3 bucket in Lake Formation. Use Lake Formation access controls to limit access.
D
Create an Amazon Redshift cluster. Schedule an AWS Lambda function to periodically copy data from Amazon S3 and Amazon RDS to Amazon Redshift. Use Amazon Redshift access controls to limit access.
Explanation:
Correct Answer: C
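AWS Lake Formation is designed for this scenario. Registering the S3 bucket brings the purchase data under Lake Formation's control, and the AWS Glue JDBC connection lets Glue reach the customer data in Amazon RDS. Lake Formation then manages permissions centrally, down to the table, column, and row level, and as a managed service it keeps operational overhead low. The other options fall short: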
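Option A: Amazon RDS is a transactional database, not an analytics store. Redirecting the purchase data into RDS would mean reworking the ingestion path and running analytics queries against the same system that serves operational workloads. RDS access controls are also managed inside each database engine, so they do not provide one central place to manage fine-grained permissions across all the data.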
Option B: Athena can query the data, but S3 policies control access only at the bucket, prefix, and object level, not at the table, column, or row level that analytics teams need. The scheduled Lambda copy job is also custom code that has to be maintained.
Option D: Amazon Redshift is a data warehouse, but this design means running a cluster and maintaining scheduled Lambda jobs that copy data from both S3 and RDS, which adds operational overhead and duplicates the data. Redshift access controls would also apply only to the copies loaded into the cluster, not to the source data in S3 and RDS.
Option C therefore scales to large datasets and gives multiple analytics teams the fine-grained permissions they need, with minimal operational overhead.
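To make the fine-grained permissions in Option C concrete, here is a minimal sketch of how the S3 registration and a column-level grant might look with boto3. The bucket ARN, role ARN, database, table, and column names are all hypothetical and are not taken from the question. The helper functions only build request arguments, and `apply_requests` is an illustrative wrapper that assumes boto3 is installed and AWS credentials are configured.

```python
from __future__ import annotations

from typing import Any, Iterable


def register_request(bucket_arn: str) -> dict[str, Any]:
    """Build arguments for LakeFormation.register_resource (S3 location)."""
    return {"ResourceArn": bucket_arn, "UseServiceLinkedRole": True}


def column_grant_request(
    principal_arn: str, database: str, table: str, columns: Iterable[str]
) -> dict[str, Any]:
    """Build arguments for LakeFormation.grant_permissions.

    Grants SELECT on the listed columns only, which is the kind of
    fine-grained control that S3 bucket policies cannot express.
    """
    return {
        "Principal": {"DataLakePrincipalIdentifier": principal_arn},
        "Resource": {
            "TableWithColumns": {
                "DatabaseName": database,
                "Name": table,
                "ColumnNames": list(columns),
            }
        },
        "Permissions": ["SELECT"],
    }


def apply_requests(register: dict[str, Any], grant: dict[str, Any]) -> None:
    """Send both requests to AWS (hypothetical wrapper, needs credentials)."""
    import boto3  # imported lazily so the builders work without the SDK

    client = boto3.client("lakeformation")
    client.register_resource(**register)
    client.grant_permissions(**grant)


# Hypothetical values for illustration only.
register = register_request("arn:aws:s3:::example-purchase-data")
grant = column_grant_request(
    "arn:aws:iam::111122223333:role/ExampleMarketingAnalyst",
    "example_sales_db",
    "purchases",
    ["order_id", "order_total"],
)
```

Calling `apply_requests(register, grant)` would register the bucket and let the example role query only `order_id` and `order_total` through services such as Athena. The table itself would first have to exist in the Glue Data Catalog, for example after a crawler run.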