
You are designing a data pipeline that ingests data from multiple sources into a Delta table on Azure Databricks and must ensure the highest level of data quality and accuracy. The solution must use Delta Lake's features effectively while respecting constraints such as cost, compliance with data governance policies, and scalability for large data volumes. Which of the following approaches BEST uses Delta Lake's capabilities to meet these requirements? (Choose one option)
A. Implementing external data validation tools before ingestion to ensure data quality, bypassing Delta Lake's built-in features to reduce processing time.
B. Utilizing Delta Lake's schema enforcement feature to automatically reject any data that does not conform to the predefined schema, ensuring data integrity.
C. Configuring the pipeline to skip schema validation to maximize ingestion speed, relying on post-ingestion data cleaning processes to address any quality issues.
D. Enabling Delta Lake's transaction log but limiting its size to reduce storage costs, accepting the risk of minor data inconsistencies.