
Answer-first summary for fast verification
Answer: Configure a Glue Crawler to infer schema and use custom ETL scripts for data processing.
Option B is correct: a Glue Crawler infers the schema of the S3 data and registers it in the Glue Data Catalog, after which custom ETL scripts process the data efficiently in batches. Option A is impractical because Glue cannot process large datasets sensibly without crawler or catalog configuration. Option C is inefficient because Glue reads data in place from S3, so manual uploads are unnecessary. Option D adds needless complexity: Lambda's execution limits make it a poor fit for batch processing that Glue already handles natively.
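The Option B approach can be sketched with `boto3`. The crawler name, IAM role ARN, database, and bucket path below are placeholder assumptions, not values from the question; this is a minimal sketch of the crawler configuration, not a production setup.

```python
import json

def build_crawler_config(name: str, role_arn: str, database: str, s3_path: str) -> dict:
    """Build the argument dict for glue.create_crawler().

    The crawler scans the S3 path, infers the schema, and writes
    table definitions into the given Glue Data Catalog database.
    """
    return {
        "Name": name,
        "Role": role_arn,
        "DatabaseName": database,
        "Targets": {"S3Targets": [{"Path": s3_path}]},
        # Re-crawl on a schedule so newly arrived data is picked up.
        "Schedule": "cron(0 2 * * ? *)",  # daily at 02:00 UTC
        "SchemaChangePolicy": {
            "UpdateBehavior": "UPDATE_IN_DATABASE",
            "DeleteBehavior": "LOG",
        },
    }

config = build_crawler_config(
    name="sales-crawler",                               # hypothetical names
    role_arn="arn:aws:iam::123456789012:role/GlueRole",
    database="sales_db",
    s3_path="s3://my-ingest-bucket/raw/",
)

# In a real AWS environment you would pass this to the Glue API:
#   glue = boto3.client("glue")
#   glue.create_crawler(**config)
#   glue.start_crawler(Name=config["Name"])
print(json.dumps(config["Targets"]))
```

Once the crawler has run, the inferred tables appear in the Data Catalog and can be referenced by name from a Glue ETL job.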
Author: LeetQuiz Editorial Team
You are tasked with implementing a data ingestion pipeline that requires reading data from an Amazon S3 bucket and processing it in batches using AWS Glue. The pipeline must handle large datasets and ensure efficient data processing. Describe how you would configure AWS Glue to read data from S3, including the use of Glue Crawlers and ETL scripts.
A. Use AWS Glue directly without any configuration to process data from S3.
B. Configure a Glue Crawler to infer schema and use custom ETL scripts for data processing.
C. Manually upload data to AWS Glue from S3 without using Crawlers.
D. Use AWS Lambda to process data from S3 and then use AWS Glue for ETL.
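The question also asks about processing large datasets in batches. In practice the custom ETL script runs as a PySpark job inside the `awsglue` runtime, which is not reproducible here; the sketch below only illustrates the batching idea in plain Python, with hypothetical object keys standing in for an S3 listing.

```python
from typing import Iterable, Iterator, List

def batched(keys: Iterable[str], batch_size: int) -> Iterator[List[str]]:
    """Yield S3 object keys in fixed-size batches, mirroring how an
    ETL job processes a large dataset chunk by chunk instead of
    loading everything at once."""
    batch: List[str] = []
    for key in keys:
        batch.append(key)
        if len(batch) == batch_size:
            yield batch
            batch = []
    if batch:
        yield batch  # final partial batch

# Hypothetical object keys, as an S3 listing of the raw/ prefix might return.
keys = [f"raw/part-{i:04d}.json" for i in range(10)]
batches = list(batched(keys, batch_size=4))
print(len(batches))  # → 3 batches (4 + 4 + 2 keys)
```

In a real Glue job the same effect comes from Spark partitioning and job bookmarks rather than hand-rolled chunking, but the principle of bounded per-batch work is the same.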