
A company has an AWS Glue extract, transform, and load (ETL) job that runs every day at the same time. The job processes XML data that is in an Amazon S3 bucket. New data is added to the S3 bucket every day. A solutions architect notices that AWS Glue is processing all the data during each run.
What should the solutions architect do to prevent AWS Glue from reprocessing old data?
A - Edit the job to use job bookmarks.
B - Edit the job to delete data after the data is processed.
C - Edit the job by setting the NumberOfWorkers field to 1.
D - Use a FindMatches machine learning (ML) transform.
Explanation:
Correct Answer: A - Edit the job to use job bookmarks.
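Why this is correct: Job bookmarks persist state information from previous job runs, so AWS Glue can tell which data it has already processed. With bookmarks enabled, each daily run reads only the data added to the S3 bucket since the last successful run instead of the whole bucket.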
Why other options are incorrect:
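B - Edit the job to delete data after the data is processed: Deleting the source data is destructive. It removes the raw data that might be needed later for reprocessing or auditing, and it is unnecessary because job bookmarks solve the problem without touching the data.
C - Edit the job by setting the NumberOfWorkers field to 1: NumberOfWorkers controls how many workers are allocated to a job run. It changes capacity and parallelism, not which data the job reads, so the job would still process all the data, only more slowly.
D - Use a FindMatches machine learning (ML) transform: FindMatches identifies duplicate or matching records within a dataset. It does not track which data earlier runs have already processed.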
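Key AWS Glue Concepts:
Job bookmark: State that AWS Glue persists between job runs to track the data that has already been processed.
Bookmark options: A job bookmark can be enabled, disabled, or paused for a job.
Amazon S3 sources: Job bookmarks use the last modified time of the objects to decide which objects still need to be processed.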
Best Practice: Always enable job bookmarks for recurring ETL jobs that process new data incrementally to optimize costs and processing time.
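As a rough illustration of answer A, the sketch below shows one way job bookmarks might be switched on for a run by passing the AWS Glue job parameter --job-bookmark-option with the value job-bookmark-enable. The helper names (with_bookmarks_enabled, start_incremental_run) and the job name daily-xml-etl are hypothetical, and glue_client is assumed to be a boto3 Glue client created by the caller. Treat it as a sketch under those assumptions, not a definitive setup.

```python
# Minimal sketch: enable AWS Glue job bookmarks through job arguments.
# "--job-bookmark-option" and "job-bookmark-enable" are the Glue job parameter
# and value; every helper name below is a hypothetical one for illustration.

BOOKMARK_KEY = "--job-bookmark-option"
BOOKMARK_ENABLE = "job-bookmark-enable"


def with_bookmarks_enabled(arguments=None):
    """Return a copy of the job arguments with job bookmarks switched on."""
    merged = dict(arguments or {})  # copy so the caller's dict is not mutated
    merged[BOOKMARK_KEY] = BOOKMARK_ENABLE
    return merged


def start_incremental_run(glue_client, job_name, arguments=None):
    """Start a job run with bookmarks enabled and return its run ID.

    glue_client is assumed to behave like boto3.client("glue").
    """
    response = glue_client.start_job_run(
        JobName=job_name,
        Arguments=with_bookmarks_enabled(arguments),
    )
    return response["JobRunId"]


# Example usage (assumes boto3 is installed and AWS credentials are configured;
# "daily-xml-etl" is a hypothetical job name):
#
#   import boto3
#   run_id = start_incremental_run(boto3.client("glue"), "daily-xml-etl")
```

The parameter alone is not enough. For bookmarks to take effect, the job script also has to call job.init() at the start and job.commit() at the end, and its data sources need a transformation_ctx value so that AWS Glue can associate the saved state with each source.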