Your company is planning to migrate a large dataset from CSV to Apache Parquet for better performance and efficiency in data processing. You are tasked with designing a data transformation pipeline using AWS services. What steps should you take to ensure the pipeline is optimized for performance and cost-effectiveness?
Explanation:
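Option B is the most appropriate choice for this scenario. AWS Glue is a fully managed, serverless extract, transform, and load (ETL) service that runs on Apache Spark, so it can process large datasets without a cluster to provision or manage. A Glue job can read the CSV data into a DynamicFrame, which infers the schema as the data is read, and write it back out as Parquet. Because Parquet is a compressed, columnar format, downstream queries read only the columns they need, which improves performance and reduces both storage and scan costs. Storing the result in Amazon S3 keeps storage inexpensive and makes the data easy to reach for further processing or analysis.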
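As a rough illustration of the approach described above, the following is a minimal sketch of a Glue job that reads CSV from S3 and writes Parquet back to S3. It is not taken from the question or its options. The bucket paths, the partition column names, and the helper names `parquet_sink_options` and `run_csv_to_parquet` are all hypothetical. The `awsglue` and `pyspark` imports sit inside the job function because those modules are only available in the Glue runtime.

```python
def parquet_sink_options(output_path, partition_keys=()):
    """Build the S3 connection options for the Parquet sink.

    Partitioning the output lets downstream queries skip whole S3 prefixes.
    The column names passed in are assumptions about the dataset.
    """
    options = {"path": output_path}
    if partition_keys:
        options["partitionKeys"] = list(partition_keys)
    return options


def run_csv_to_parquet(input_path, output_path, partition_keys=()):
    """Read CSV from S3 into a DynamicFrame and write it out as Parquet."""
    # Imported here because these modules exist only inside the Glue runtime.
    from awsglue.context import GlueContext
    from pyspark.context import SparkContext

    glue_context = GlueContext(SparkContext.getOrCreate())

    # Read every CSV file under input_path, treating the first row as a header.
    source = glue_context.create_dynamic_frame.from_options(
        connection_type="s3",
        connection_options={"paths": [input_path], "recurse": True},
        format="csv",
        format_options={"withHeader": True},
    )

    # Write the same records as Snappy-compressed Parquet.
    glue_context.write_dynamic_frame.from_options(
        frame=source,
        connection_type="s3",
        connection_options=parquet_sink_options(output_path, partition_keys),
        format="parquet",
        format_options={"compression": "snappy"},
    )
```

Inside a Glue job script, this might be invoked as `run_csv_to_parquet("s3://example-input-bucket/csv/", "s3://example-output-bucket/parquet/", ["year"])`, where both bucket names and the `year` column are placeholders. Partitioning only helps when later queries filter on the partition columns, so whether to partition, and on what, depends on the actual access pattern.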