
AWS Certified Data Engineer - Associate
You are tasked with optimizing a Python script that ingests large CSV files into an Amazon Redshift cluster. The current script takes over 2 hours to complete the ingestion for a 1GB file. Which of the following strategies would you consider to reduce the runtime?
Explanation:
All of the listed options are valid ways to reduce the runtime of the ingestion script. AWS Glue can automate and parallelize the ETL process, adding nodes to the Redshift cluster distributes the load across more compute, and multiprocessing in Python lets the script handle multiple files concurrently instead of one after another.
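The "multiple files concurrently" idea can be sketched as follows. This is a minimal illustration, not the script from the question: it splits one large CSV into part files and hands them to a pool of workers. The names `split_csv`, `process_part`, and `process_concurrently` are hypothetical, and `process_part` only counts rows as a stand-in for the real per-file work (for example, uploading the part to S3 so Redshift can load it). A thread pool is used here on the assumption that the per-file work is I/O-bound; for CPU-bound work, `ProcessPoolExecutor` from the same module could be swapped in.

```python
import csv
import os
from concurrent.futures import ThreadPoolExecutor


def split_csv(src_path, out_dir, rows_per_part):
    """Split src_path into part files of at most rows_per_part data rows.

    The header row is repeated at the top of every part file.
    Returns the list of part file paths in order.
    """
    part_paths = []
    with open(src_path, newline="") as src:
        reader = csv.reader(src)
        header = next(reader)
        out, writer, count = None, None, 0
        for row in reader:
            # Start a new part file when none is open or the current one is full.
            if out is None or count >= rows_per_part:
                if out is not None:
                    out.close()
                path = os.path.join(out_dir, f"part_{len(part_paths):04d}.csv")
                out = open(path, "w", newline="")
                writer = csv.writer(out)
                writer.writerow(header)
                part_paths.append(path)
                count = 0
            writer.writerow(row)
            count += 1
        if out is not None:
            out.close()
    return part_paths


def process_part(path):
    """Hypothetical per-file work; here it only counts the data rows."""
    with open(path, newline="") as f:
        return sum(1 for _ in csv.reader(f)) - 1  # subtract the header row


def process_concurrently(paths, max_workers=4):
    """Run process_part on every path using a pool of worker threads."""
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # pool.map returns results in the same order as the input paths.
        return list(pool.map(process_part, paths))
```

Calling `process_concurrently(split_csv("big.csv", "parts", 100_000))` would process the parts with four workers; the file name, directory, and part size are placeholders to tune for the actual workload.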