You are designing a data engineering solution on Azure Databricks that requires high data integrity and reliability for a financial reporting system. The solution must comply with strict regulatory requirements, including GDPR, and must scale to petabytes of data. Which of the following strategies would you employ to ensure the integrity and reliability of data in Delta Lake, considering the need for transactional consistency, schema enforcement, and the ability to audit data changes over time? Choose the best option.

A. Rely solely on Delta Lake's automatic optimizations, with no additional configuration for compliance.
B. Implement custom validation and reconciliation logic outside Delta Lake to verify data consistency manually.
C. Use Delta Lake's built-in ACID transactions, schema enforcement, and time travel capabilities.
D. Disable transactional guarantees and schema checks to maximize write performance.
Explanation:
The correct answer is C. Delta Lake's built-in ACID transactions, schema enforcement, and time travel are designed to guarantee data integrity and reliability, and the versioned transaction log provides the audit trail needed to comply with strict regulatory requirements such as GDPR. Option A is insufficient: automatic optimizations alone do not address compliance or integrity needs without deliberate configuration. Option B adds unnecessary complexity and room for error by reimplementing guarantees that Delta Lake already provides natively. Option D is incorrect because sacrificing data integrity and compliance for performance is unacceptable in a financial reporting system.
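To make the three features in option C concrete, here is a minimal PySpark sketch. It assumes a Databricks-style environment where Delta Lake is already available; the storage path, table contents, and column names are hypothetical, chosen only for illustration.

```python
from pyspark.sql import SparkSession

spark = (
    SparkSession.builder
    .appName("delta-integrity-demo")
    .getOrCreate()
)

# Hypothetical storage location for the financial reporting table.
path = "/mnt/finance/reports_delta"

# ACID transactions: each write either fully commits to the Delta
# transaction log or has no effect at all.
df = spark.createDataFrame(
    [(1, "Q1", 125000.0)], ["report_id", "period", "amount"]
)
df.write.format("delta").mode("append").save(path)

# Schema enforcement: appending a frame whose schema conflicts with the
# table (here, a string where a double is expected) is rejected instead
# of silently corrupting the data.
bad = spark.createDataFrame(
    [(2, "Q2", "not-a-number")], ["report_id", "period", "amount"]
)
try:
    bad.write.format("delta").mode("append").save(path)
except Exception as e:
    print(f"Write rejected by schema enforcement: {e}")

# Time travel: read the table as of an earlier version, e.g. to audit
# what a regulator-facing report looked like at a past point in time.
v0 = spark.read.format("delta").option("versionAsOf", 0).load(path)
v0.show()

# Audit trail: DESCRIBE HISTORY exposes every committed change
# (operation, timestamp, user) for compliance review.
spark.sql(f"DESCRIBE HISTORY delta.`{path}`").show(truncate=False)
```

Note how all three guarantees come from the table format itself rather than from external tooling, which is precisely why option B's custom validation layer is redundant and why option D's trade-off is unnecessary.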