
You are managing a Delta Lake table in Azure Databricks and must enforce high data quality standards to comply with GDPR. You need to select the most effective method to prevent bad data from being inserted; the solution must ensure data integrity while remaining cost-effective and scalable to large data volumes. Given these constraints, which of the following approaches BEST meets the requirements? (Choose one option)
A. Use the CHECK constraint to validate data at the time of insertion, ensuring that only data meeting specific conditions is written to the table.
B. Implement a custom validation function and use the WITH WATERMARK clause to enforce data quality, which allows for more complex validation logic but may introduce additional processing overhead.
C. Leverage the NOT NULL constraint to ensure mandatory fields are not empty, a basic but effective method for enforcing data quality on specific columns.
D. Utilize the UNIQUE constraint to prevent duplicate records from being written, which is useful for maintaining data uniqueness but does not address other data quality issues.
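
For context on the constraint-based options: Delta Lake on Databricks enforces NOT NULL and CHECK constraints at write time through ALTER TABLE. The following is a minimal sketch, assuming a hypothetical Delta table named customer_events with email and consent_status columns (names chosen purely for illustration):

from pyspark.sql import SparkSession

# In a Databricks notebook `spark` is already defined; outside one, this
# obtains (or creates) a session with Delta Lake support configured.
spark = SparkSession.getActiveSession() or SparkSession.builder.getOrCreate()

# NOT NULL constraint: the mandatory column may not be empty
# (the mechanism described in option C).
spark.sql("ALTER TABLE customer_events ALTER COLUMN email SET NOT NULL")

# CHECK constraint: every written row must satisfy the condition
# (the mechanism described in option A).
spark.sql("""
    ALTER TABLE customer_events
    ADD CONSTRAINT valid_consent
    CHECK (consent_status IN ('granted', 'revoked'))
""")

# Any INSERT, UPDATE, or MERGE that violates either constraint fails the
# transaction, so non-conforming rows are never committed to the table.

Because the constraints are part of the table's metadata and are checked by the Delta transaction itself, they scale with normal write throughput and require no separate validation job.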