A data engineer is tasked with creating a data pipeline that ingests data from a high-velocity source system, which generates millions of files daily in cloud storage. The pipeline must incrementally identify and ingest only the files that have arrived since its last run, while also accommodating expected schema changes over time. Which technique should the data engineer use to address these requirements?