
Ultimate access to all questions.
You are a data engineer working on a project that requires efficient data versioning for a large dataset stored in Delta Lake. The project has strict requirements for cost efficiency, compliance with data governance policies, and the ability to scale as the dataset grows. Considering these constraints, which of the following approaches BEST utilizes Delta Lake tables for efficient data versioning? Choose the most appropriate option and explain why it is the best choice under the given constraints.
A
Create separate Delta Lake tables for each version of the dataset, ensuring isolation but increasing storage costs and complexity in management.
B
Maintain a single version of the dataset in one Delta Lake table, simplifying management but losing the ability to track historical changes.
C
Use Delta Lake's time travel feature to store multiple versions of the dataset in the same table, with each version accessible through versioning metadata, optimizing for cost and compliance.
D
Store each version of the dataset as a separate column within the same Delta Lake table, which complicates queries and increases storage overhead.