
Answer-first summary for fast verification
Answer: Create an AWS Glue extract, transform, and load (ETL) job to read from the .csv structured data source. Configure the job to write the data into the data lake in Apache Parquet format.
Option D is CORRECT because creating an AWS Glue extract, transform, and load (ETL) job to read from the .csv structured data source and configuring the job to write the data into the data lake in Apache Parquet format is the most cost-effective solution. Apache Parquet is a columnar storage file format, so Amazon Athena can read only the columns a query references instead of scanning entire rows. Because Athena bills by the amount of data scanned, this significantly reduces cost and improves performance when analysts query only one or two of the 15 columns.
Author: Ritesh Yadav
Question 12/60
A data engineer must ingest a source of structured data that is in .csv format into an Amazon S3 data lake. The .csv files contain 15 columns. Data analysts need to run Amazon Athena queries on one or two columns of the dataset. The data analysts rarely query the entire file.
Which solution will meet these requirements MOST cost-effectively?
A
Use an AWS Glue PySpark job to ingest the source data into the data lake in .csv format.
B
Create an AWS Glue extract, transform, and load (ETL) job to read from the .csv structured data source. Configure the job to ingest the data into the data lake in JSON format.
C
Use an AWS Glue PySpark job to ingest the source data into the data lake in Apache Avro format.
D
Create an AWS Glue extract, transform, and load (ETL) job to read from the .csv structured data source. Configure the job to write the data into the data lake in Apache Parquet format.