
Answer-first summary for fast verification
Answer: Use AWS Glue crawlers' parallel scanning feature to scan multiple data sources simultaneously.
To optimize AWS Glue crawler performance, use the parallel scanning feature so that multiple data sources are scanned simultaneously rather than one after another. This can significantly reduce the time needed to discover schemas and populate the Data Catalog. The other options fall short: running crawlers more often (A) adds resource usage without making each run faster, while limiting the data sources each crawler scans (C) or disabling schema discovery (D) can leave the Data Catalog outdated or incomplete.
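As a rough illustration of attaching several data sources to one crawler, here is a minimal sketch using boto3's `create_crawler` and `start_crawler` calls. The crawler name, role ARN, database name, and S3 paths are hypothetical placeholders, and the helper functions are not part of any AWS API. How Glue schedules the scan across targets is internal to the service, so this only shows the configuration side.

```python
from typing import Any


def build_crawler_request(name: str, role_arn: str, database: str,
                          s3_paths: list[str]) -> dict[str, Any]:
    """Build keyword arguments for boto3's glue.create_crawler call."""
    if not s3_paths:
        raise ValueError("at least one S3 path is required")
    return {
        "Name": name,
        "Role": role_arn,
        "DatabaseName": database,
        # One crawler, several data sources: each path becomes its own S3 target.
        "Targets": {"S3Targets": [{"Path": path} for path in s3_paths]},
    }


def create_and_start(request: dict[str, Any]) -> None:
    """Create the crawler and start a run (needs boto3 and AWS credentials)."""
    import boto3  # imported lazily so the builder above works without boto3

    glue = boto3.client("glue")
    glue.create_crawler(**request)
    glue.start_crawler(Name=request["Name"])


# All values below are hypothetical placeholders.
request = build_crawler_request(
    name="sales-crawler",
    role_arn="arn:aws:iam::123456789012:role/ExampleGlueCrawlerRole",
    database="sales_db",
    s3_paths=["s3://example-bucket/orders/", "s3://example-bucket/customers/"],
)
print(request["Targets"])
```

Calling `create_and_start(request)` would submit the configuration to AWS. It is kept separate so the request can be inspected or tested without credentials.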
Author: LeetQuiz Editorial Team
You are tasked with optimizing the performance of AWS Glue crawlers when discovering schemas and populating the Data Catalog. Which strategy should you implement to achieve this?
A. Increase the frequency of crawler runs to ensure the data catalog is always up-to-date.
B. Use AWS Glue crawlers' parallel scanning feature to scan multiple data sources simultaneously.
C. Limit the number of data sources scanned by each crawler to reduce the load on the system.
D. Disable schema discovery for data sources that do not require frequent updates.