
A data engineering team is migrating an enterprise system comprising thousands of tables and views into a Medallion Lakehouse architecture. The design utilizes Bronze tables for raw production data, Silver tables for validated ML and engineering workflows, and Gold tables for business intelligence and reporting. While PII is present across all tiers, strict pseudonymization and anonymization rules are enforced at the Silver and Gold levels.
To minimize security risks and enhance cross-team collaboration, which of the following reflects a best practice for implementing this system?
A
Centralize all production tables within a single database to provide a unified perspective of all data assets, streamlining discoverability by granting all users view privileges on this database.
B
Segregate tables into separate databases (schemas) based on data quality tiers to simplify permissions management via database-level ACLs and allow for physical isolation of default storage locations.
C
Utilize the default Databricks database for managed tables, as storing data in the DBFS root provides the most robust security framework for enterprise-scale systems.
D
Avoid complex database organization, as databases on Databricks are primarily logical constructs and do not significantly impact the security or discoverability of assets within the Lakehouse.
E
Provision separate storage containers for every database created, as Unity Catalog requires a 1:1 mapping between storage locations and databases to ensure data isolation.
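
For reference, the tier-based layout that option B describes (one database per quality tier, each with its own default storage location and database-level grants) might be sketched as below. This is a minimal illustration rather than a definitive setup: the storage paths, group names, and the `Tier` and `build_statements` helpers are hypothetical, and the generated SQL assumes the legacy Hive metastore table access control syntax (`CREATE DATABASE ... LOCATION`, `GRANT ... ON DATABASE`). A Unity Catalog workspace would use different privilege names and `MANAGED LOCATION`.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass(frozen=True)
class Tier:
    """One data quality tier mapped to its own database (schema)."""
    name: str                 # database name, e.g. "bronze"
    location: str             # hypothetical default storage path for managed tables
    readers: Tuple[str, ...]  # hypothetical groups granted read access


# Assumed layout: access widens as PII is pseudonymized or anonymized.
TIERS = [
    Tier("bronze", "/mnt/lake/bronze", ("data-engineers",)),
    Tier("silver", "/mnt/lake/silver", ("data-engineers", "ml-engineers")),
    Tier("gold", "/mnt/lake/gold", ("data-engineers", "ml-engineers", "bi-analysts")),
]


def build_statements(tiers: List[Tier]) -> List[str]:
    """Build one CREATE DATABASE and the database-level GRANTs for each tier."""
    statements = []
    for tier in tiers:
        # A separate LOCATION per database keeps each tier's managed tables
        # in a physically distinct storage path.
        statements.append(
            f"CREATE DATABASE IF NOT EXISTS {tier.name} LOCATION '{tier.location}'"
        )
        # Privileges granted on the database apply to the tables inside it,
        # so permissions are managed per tier rather than per table.
        for group in tier.readers:
            statements.append(
                f"GRANT USAGE, SELECT ON DATABASE {tier.name} TO `{group}`"
            )
    return statements


if __name__ == "__main__":
    for statement in build_statements(TIERS):
        print(statement + ";")
```

In a notebook, each generated statement could be passed to `spark.sql(...)`. With thousands of tables, granting at the database level means a new table inherits its tier's access rules without a per-table grant.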