
Answer-first summary for fast verification
Answer: Use an AWS Glue crawler to scan the S3 bucket and infer the schema from the CSV file.
The most efficient way to discover the schema of a new data source and populate the AWS Glue Data Catalog is to run an AWS Glue crawler. Pointed at the S3 location that holds the CSV file, the crawler classifies the data, infers the column names and types, and creates the matching table definition in the Data Catalog. Defining the schema manually (A) or writing an AWS Glue job to extract it (C) takes more time and is more error-prone. Uploading the CSV file through the AWS Glue console (D) is not a direct way to populate the Data Catalog with the schema.
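The crawler approach can be sketched with the boto3 Glue client. This is a minimal illustration rather than a definitive setup: the crawler name, IAM role ARN, database name, and S3 path are placeholder assumptions, and the helper functions `build_crawler_request` and `create_and_start_crawler` are hypothetical names introduced here. Only `create_crawler` and `start_crawler` are actual Glue client calls.

```python
from typing import Any, Dict


def build_crawler_request(name: str, role_arn: str, database: str, s3_path: str) -> Dict[str, Any]:
    """Build the keyword arguments for the Glue CreateCrawler call.

    All values passed in are placeholders chosen by the caller.
    """
    return {
        "Name": name,
        "Role": role_arn,          # IAM role the crawler assumes to read S3
        "DatabaseName": database,  # Data Catalog database that receives the table
        "Targets": {"S3Targets": [{"Path": s3_path}]},
    }


def create_and_start_crawler(glue_client: Any, request: Dict[str, Any]) -> None:
    """Create the crawler, then start a run that infers the CSV schema."""
    glue_client.create_crawler(**request)
    glue_client.start_crawler(Name=request["Name"])


# Example usage (requires boto3, AWS credentials, and an existing IAM role;
# every value below is an assumed placeholder):
#
#   import boto3
#   glue = boto3.client("glue")
#   request = build_crawler_request(
#       name="csv-schema-crawler",
#       role_arn="arn:aws:iam::123456789012:role/ExampleGlueCrawlerRole",
#       database="example_db",
#       s3_path="s3://example-bucket/csv-data/",
#   )
#   create_and_start_crawler(glue, request)
```

Once the crawler run finishes, the inferred table and its columns should appear in the chosen Data Catalog database, where they can be inspected in the console or with the client's `get_tables` call.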
Author: LeetQuiz Editorial Team
You need to discover the schema of a new data source and register it in the AWS Glue Data Catalog. The data source is stored in an S3 bucket in CSV format. What is the most efficient way to populate the Data Catalog with the schema?
A
Manually define the schema in the AWS Glue Data Catalog based on the CSV file structure.
B
Use an AWS Glue crawler to scan the S3 bucket and infer the schema from the CSV file.
C
Create an AWS Glue job that reads the CSV file and extracts the schema, then use the output to update the data catalog.
D
Use the AWS Glue console to upload the CSV file and automatically generate the schema in the data catalog.