
Answer-first summary for fast verification
Answer: Relocate the .json files to a different path within the S3 bucket.
Option C is CORRECT. Athena does not honor exclude patterns that are configured on an AWS Glue crawler. A query reads every object under the table's S3 location, so the .json files are still scanned even though the crawler left them out of the catalog. Moving the .json files to a path outside the table's location leaves only the .csv files under that prefix, so Athena scans less data and the queries run faster. The .csv files stay where they are and no permissions change, so access to them in the source S3 bucket is unaffected. Option A does not help because the crawler already excludes the .json files.
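The relocation in Option C can be sketched in Python as follows. This is a minimal illustration, not a prescribed procedure: the function names plan_relocation and relocate, the prefixes "sales/" and "sales-json/", and the sample keys are all hypothetical. The relocate helper assumes it is handed a boto3 S3 client and uses its copy_object and delete_object calls; a single copy_object request is limited to objects of up to 5 GB.

```python
from typing import Iterable, List, Tuple


def plan_relocation(
    keys: Iterable[str], source_prefix: str, dest_prefix: str
) -> List[Tuple[str, str]]:
    """Return (old_key, new_key) pairs for every .json object under source_prefix.

    The prefixes are hypothetical examples. The destination must sit outside
    the source prefix, otherwise Athena would still scan the moved files.
    """
    if dest_prefix.startswith(source_prefix):
        raise ValueError("dest_prefix must be outside source_prefix")
    moves = []
    for key in keys:
        if key.startswith(source_prefix) and key.lower().endswith(".json"):
            relative = key[len(source_prefix):]
            moves.append((key, dest_prefix + relative))
    return moves


def relocate(s3_client, bucket: str, moves: List[Tuple[str, str]]) -> None:
    """Copy each object to its new key, then delete the original.

    s3_client is assumed to be a boto3 S3 client, e.g. boto3.client("s3").
    """
    for old_key, new_key in moves:
        s3_client.copy_object(
            Bucket=bucket,
            Key=new_key,
            CopySource={"Bucket": bucket, "Key": old_key},
        )
        s3_client.delete_object(Bucket=bucket, Key=old_key)


if __name__ == "__main__":
    # Dry run on made-up keys: only the plan is printed, nothing is moved.
    sample_keys = ["sales/2024/a.csv", "sales/2024/a.json", "sales/b.json"]
    for old, new in plan_relocation(sample_keys, "sales/", "sales-json/"):
        print(old, "->", new)
```

Separating the plan from the copy step lets the list of moves be reviewed before any object is touched. After the move, the table's location contains only .csv files, so neither the crawler nor Athena needs any further configuration.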
Author: Ritesh Yadav
Question 6/58
A data engineer is using an AWS Glue crawler to catalog data that is in an Amazon S3 bucket. The S3 bucket contains both .csv and .json files. The data engineer configured the crawler to exclude the .json files from the catalog.
When the data engineer runs queries in Amazon Athena, the queries also process the excluded .json files. The data engineer wants to resolve this issue. The data engineer needs a solution that will not affect access requirements for the .csv files in the source S3 bucket.
Which solution will meet this requirement with the SHORTEST query times?
A
Adjust the AWS Glue crawler settings to ensure that the AWS Glue crawler also excludes .json files.
B
Use the Athena console to ensure the Athena queries also exclude the .json files.
C
Relocate the .json files to a different path within the S3 bucket.
D
Use S3 bucket policies to block access to the .json files.