
## Answer-first summary for fast verification
Answer: Create an AWS Glue extract, transform, and load (ETL) job that runs on a schedule. Configure the ETL job to process the .csv files and store the processed data in Amazon Redshift.
## Explanation

**Correct Answer: A**

**Why Option A is correct:**

1. **AWS Glue is a serverless ETL service** that requires minimal operational overhead: there is no infrastructure to manage.
2. **Scheduled execution** allows automated processing of the CSV files.
3. **Direct integration with Amazon Redshift**: the COTS application can run its SQL queries against Redshift.
4. **AWS Glue can process CSV files** and transform them into a format suitable for Redshift.
5. **Least operational overhead** of the four options, since AWS Glue is fully managed.

**Why the other options are incorrect:**

**Option B:**
- Requires managing EC2 instances (operational overhead for patching, scaling, and monitoring).
- Requires developing and maintaining a custom Python script.
- Cron scheduling on EC2 is less reliable than AWS-managed scheduling.
- Converting to .sql files may not produce a format the COTS application can use.

**Option C:**
- The COTS application supports only Amazon Redshift and Amazon S3 as data sources, so it cannot query DynamoDB.
- Lambda functions have a 15-minute execution limit, which may be insufficient for large ETL jobs.
- Requires managing DynamoDB table capacity and costs.
- A more complex architecture than the requirement calls for.

**Option D:**
- Amazon EMR requires significant operational overhead (cluster management, scaling, and monitoring).
- A weekly schedule may not meet frequent processing needs.
- EMR clusters are expensive to run and must be managed; this is overkill for a simple CSV transformation task.

**Key AWS services considered:**
- **AWS Glue**: serverless ETL service suited to scheduled data transformation.
- **Amazon Redshift**: data warehouse the COTS application can query.
- **Amazon S3**: source storage for the CSV files.

**Architecture pattern:** Legacy App → S3 (CSV files) → AWS Glue (scheduled ETL) → Amazon Redshift → COTS Application (SQL queries)
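To make the transform step concrete, here is a minimal Python sketch of the kind of per-row work a Glue ETL job would perform: parsing CSV text and casting each column to a typed record ready to load into a Redshift table. The column names and types are hypothetical, not from the question; a real Glue job would express this with DynamicFrames or Spark DataFrames rather than the standard library.

```python
import csv
import io

def transform_csv(csv_text, schema):
    """Parse CSV text and cast each column per the schema,
    yielding typed rows shaped for loading into a warehouse table."""
    reader = csv.DictReader(io.StringIO(csv_text))
    return [
        {col: cast(record[col]) for col, cast in schema.items()}
        for record in reader
    ]

# Hypothetical schema for the legacy application's output.
schema = {"order_id": int, "amount": float, "region": str}

sample = "order_id,amount,region\n1001,19.99,eu-west-1\n1002,5.50,us-east-1\n"
rows = transform_csv(sample, schema)
print(rows[0])  # {'order_id': 1001, 'amount': 19.99, 'region': 'eu-west-1'}
```

In Glue itself, the equivalent casting is usually declared in the job's schema mapping, and the write to Redshift is handled by a Glue connection rather than hand-written load code.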
Author: LeetQuiz Editorial Team
## Question

A company uses a legacy application to produce data in CSV format. The legacy application stores the output data in Amazon S3. The company is deploying a new commercial off-the-shelf (COTS) application that can perform complex SQL queries to analyze data that is stored in Amazon Redshift and Amazon S3 only. However, the COTS application cannot process the .csv files that the legacy application produces.
The company cannot update the legacy application to produce data in another format. The company needs to implement a solution so that the COTS application can use the data that the legacy application produces.
Which solution will meet these requirements with the LEAST operational overhead?
**A.** Create an AWS Glue extract, transform, and load (ETL) job that runs on a schedule. Configure the ETL job to process the .csv files and store the processed data in Amazon Redshift.

**B.** Develop a Python script that runs on Amazon EC2 instances to convert the .csv files to .sql files. Invoke the Python script on a cron schedule to store the output files in Amazon S3.

**C.** Create an AWS Lambda function and an Amazon DynamoDB table. Use an S3 event to invoke the Lambda function. Configure the Lambda function to perform an extract, transform, and load (ETL) job to process the .csv files and store the processed data in the DynamoDB table.

**D.** Use Amazon EventBridge to launch an Amazon EMR cluster on a weekly schedule. Configure the EMR cluster to perform an extract, transform, and load (ETL) job to process the .csv files and store the processed data in an Amazon Redshift table.
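As a sketch of how option A could be wired up, the following builds the request parameters for a scheduled Glue job and trigger. All names, the IAM role ARN, the script location, and the cron expression are hypothetical placeholders; real use requires AWS credentials, an existing role, and an uploaded ETL script, so the boto3 calls are shown commented out.

```python
# Hypothetical job definition for boto3's glue.create_job().
job_def = {
    "Name": "csv-to-redshift-etl",
    "Role": "arn:aws:iam::123456789012:role/GlueEtlRole",  # placeholder ARN
    "Command": {
        "Name": "glueetl",
        "ScriptLocation": "s3://example-etl-bucket/scripts/csv_to_redshift.py",
        "PythonVersion": "3",
    },
    "GlueVersion": "4.0",
}

# A scheduled trigger runs the job automatically (no cron host to manage).
trigger_def = {
    "Name": "nightly-csv-etl",
    "Type": "SCHEDULED",
    "Schedule": "cron(0 2 * * ? *)",  # 02:00 UTC daily, Glue cron syntax
    "Actions": [{"JobName": job_def["Name"]}],
    "StartOnCreation": True,
}

# import boto3
# glue = boto3.client("glue")
# glue.create_job(**job_def)
# glue.create_trigger(**trigger_def)
print(trigger_def["Schedule"])
```

Because the schedule lives in Glue rather than in cron on an EC2 instance, there is no host to patch or monitor, which is the "least operational overhead" property the question asks for.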