You are tasked with implementing a data ingestion pipeline that requires reading data from an Amazon S3 bucket and processing it in batches using AWS Glue. The pipeline must handle large datasets and ensure efficient data processing. Describe how you would configure AWS Glue to read data from S3, including the use of Glue Crawlers and ETL scripts.
Explanation:
Option B is correct: a Glue Crawler scans the S3 data, infers its schema, and registers the resulting tables in the Glue Data Catalog, and custom ETL scripts then read those tables and process the data efficiently. Option A (using Glue directly without any configuration) is not practical for large datasets. Option C (manually uploading data to Glue) is inefficient. Option D (using Lambda for data processing) adds unnecessary complexity when Glue can handle the task itself.
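The crawler-plus-ETL setup described in Option B might be sketched as follows. This is a minimal illustration rather than a definitive implementation: the function names, the crawler name, the role ARN, the database and table names, and the S3 paths are all hypothetical placeholders, and the Parquet output format is an assumption. The boto3 and awsglue imports are done lazily, so the configuration helper can be inspected without AWS credentials or a Glue runtime.

```python
from typing import Any, Dict


def build_crawler_config(name: str, role_arn: str, database: str, s3_path: str) -> Dict[str, Any]:
    """Build keyword arguments for boto3's Glue create_crawler call.

    All values passed in are placeholders chosen by the caller.
    """
    return {
        "Name": name,
        "Role": role_arn,                      # IAM role the crawler assumes
        "DatabaseName": database,              # Data Catalog database for inferred tables
        "Targets": {"S3Targets": [{"Path": s3_path}]},
    }


def create_and_start_crawler(config: Dict[str, Any]) -> None:
    """Create the crawler and start it (requires AWS credentials)."""
    import boto3  # imported lazily so build_crawler_config works without boto3

    glue = boto3.client("glue")
    glue.create_crawler(**config)
    glue.start_crawler(Name=config["Name"])


def run_etl(database: str, table: str, output_path: str) -> None:
    """ETL step intended to run inside a Glue job, after the crawler has finished."""
    from awsglue.context import GlueContext   # only available in the Glue runtime
    from pyspark.context import SparkContext

    glue_context = GlueContext(SparkContext.getOrCreate())

    # Read the table that the crawler registered in the Data Catalog.
    frame = glue_context.create_dynamic_frame.from_catalog(
        database=database, table_name=table
    )

    # Write the result back to S3; Parquet is an assumed output format.
    glue_context.write_dynamic_frame.from_options(
        frame=frame,
        connection_type="s3",
        connection_options={"path": output_path},
        format="parquet",
    )
```

In this sketch, a caller would build the configuration, run `create_and_start_crawler` once against the raw S3 prefix, and then invoke `run_etl` from the Glue job script with the database and table names the crawler produced.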