
Explanation:
The optimal approach for tracking and visualizing data lineage in a Databricks and Delta Lake-based lakehouse is to use Databricks Unity Catalog. Here's why:
While manual documentation or custom logging are alternatives, they are less efficient and more error-prone compared to the automated and integrated approach offered by Unity Catalog.
Ultimate access to all questions.
No comments yet.
In a lakehouse architecture built on Databricks with Delta Lake, your team is looking to implement data lineage tracking to assess the impact of changes on data structures and pipelines. Which method would you choose to effectively track and visualize data lineage across your lakehouse?
A
Implement custom logging within your data pipelines to record data movements and transformations, analyzing logs with Azure Log Analytics.
B
Leverage Azure Purview for automated data lineage collection across your Databricks and Delta Lake environments, integrating with other Azure data services.
C
Manually document data flows and transformations in a shared document as part of the development process.
D
Utilize Databricks Unity Catalog, which provides data lineage capabilities for tracking dataset dependencies and transformations.