
Answer-first summary for fast verification
Answer: Leverage the built-in data catalog integration in the AWS Glue job to automatically discover and reference the data sources.
To ensure that data is properly referenced and consumed from the AWS Glue Data Catalog in a data pipeline, use the Data Catalog integration built into AWS Glue jobs. The job refers to each source by its catalog database and table name, and Glue resolves the location, format, and schema from the catalog entry, so no manual configuration or external file is needed. Referencing each source by hand in the AWS Glue console (A) or maintaining a centralized configuration file (B) can work, but both duplicate metadata the catalog already holds and scale poorly as sources are added. Calling the AWS Glue API directly (C) is also possible, but it adds code for something the built-in integration already provides.
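As a rough illustration of the built-in integration, the sketch below reads several catalog tables inside a Glue job through `create_dynamic_frame.from_catalog`. The database and table names (`sales_db`, `orders`, `customers`), the job name, and the helper functions are hypothetical and chosen only for this example. The `awsglue` and `pyspark` imports are kept inside `main()` because they are available only in the Glue job runtime.

```python
from typing import Any, Dict, List, NamedTuple


class CatalogSource(NamedTuple):
    """One table registered in the AWS Glue Data Catalog."""
    database: str
    table_name: str


# Hypothetical catalog entries; replace with the databases and tables
# that actually exist in your Data Catalog.
SOURCES: List[CatalogSource] = [
    CatalogSource("sales_db", "orders"),
    CatalogSource("sales_db", "customers"),
]


def catalog_read_args(source: CatalogSource, job_name: str) -> Dict[str, str]:
    """Build the keyword arguments for create_dynamic_frame.from_catalog."""
    return {
        "database": source.database,
        "table_name": source.table_name,
        # A distinct transformation_ctx per source lets job bookmarks
        # track each source separately.
        "transformation_ctx": f"{job_name}_{source.database}_{source.table_name}",
    }


def read_sources(glue_context: Any, sources: List[CatalogSource],
                 job_name: str) -> Dict[str, Any]:
    """Read every catalog table into a DynamicFrame keyed by 'database.table'."""
    frames = {}
    for source in sources:
        key = f"{source.database}.{source.table_name}"
        frames[key] = glue_context.create_dynamic_frame.from_catalog(
            **catalog_read_args(source, job_name)
        )
    return frames


def main() -> None:
    # These modules exist only inside the AWS Glue job runtime.
    from awsglue.context import GlueContext
    from pyspark.context import SparkContext

    glue_context = GlueContext(SparkContext.getOrCreate())
    frames = read_sources(glue_context, SOURCES, job_name="example_job")
    for name, frame in frames.items():
        print(name, frame.count())
```

In a Glue job script, `main()` would be called at the end of the file. The job never hard-codes S3 paths, formats, or schemas: it names the catalog database and table, and Glue looks up the rest from the catalog entry.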
Author: LeetQuiz Editorial Team
Your team is working on a data pipeline that requires consuming data from multiple sources in the AWS Glue Data Catalog. How can you ensure that the data is properly referenced and consumed by the pipeline?
A. Use the AWS Glue console to manually reference each data source in the pipeline.
B. Create a centralized configuration file that lists all the data sources in the data catalog and use it in the pipeline.
C. Use the AWS Glue API to programmatically reference the data sources in the data catalog and consume the data.
D. Leverage the built-in data catalog integration in the AWS Glue job to automatically discover and reference the data sources.