A data engineer is tasked with building an efficient data pipeline. The source system continuously writes files to a shared directory that other processes also read. The files must therefore be left unchanged, and they will accumulate in the directory over time. The data engineer must identify which files have been added since the last pipeline run and configure the pipeline to ingest only those new files on each subsequent run. Which of the following tools can the data engineer use to address this requirement?
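To make the requirement concrete, here is a minimal sketch of incremental file discovery in plain Python. It is only an illustration of the problem, not any particular tool's implementation: the function name `find_new_files` and the JSON checkpoint file are hypothetical, and the sketch assumes the checkpoint is stored outside the shared directory so the source files are never touched.

```python
import json
from pathlib import Path


def find_new_files(source_dir: str, checkpoint_file: str) -> list[str]:
    """Return names of files in source_dir not recorded by an earlier run.

    Hypothetical helper: state is kept in a JSON checkpoint file that
    should live outside source_dir.
    """
    checkpoint = Path(checkpoint_file)
    # Load the file names recorded by previous runs (empty on the first run).
    seen = set(json.loads(checkpoint.read_text())) if checkpoint.exists() else set()

    # List the directory without modifying or moving any source file.
    current = {p.name for p in Path(source_dir).iterdir() if p.is_file()}
    new_files = sorted(current - seen)

    # Persist the updated state so the next run skips these files.
    checkpoint.write_text(json.dumps(sorted(seen | current)))
    return new_files
```

Each call returns only the files that appeared since the previous call, which is the behavior the pipeline needs. The sketch rescans the whole directory on every run and identifies files by name only, so it would slow down as files accumulate; that cost is part of why the question asks for a tool that tracks ingestion state for you.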