
Answer-first summary for fast verification
Answer: Segregate tables into separate databases (schemas) based on data quality tiers to simplify permissions management via database-level ACLs and allow for physical isolation of default storage locations.
### Explanation

**Correct Answer: B**

In a Databricks Lakehouse environment using Unity Catalog, segregating tables into separate databases (or schemas) based on Medallion tiers (Bronze, Silver, Gold) is the recommended best practice for several reasons:

1. **Simplified Governance (ACLs):** Unity Catalog uses a hierarchical permission model. Privileges granted at the database level are inherited by all current and future tables within it. This allows administrators to enforce stricter controls on sensitive tiers (Bronze) while keeping higher-quality tiers (Gold) broadly accessible for BI needs, without managing thousands of individual table ACLs.
2. **Managed Storage Isolation:** You can assign a distinct managed storage location to each schema. This ensures that Bronze, Silver, and Gold data physically reside in separate cloud storage prefixes or buckets, reducing the 'blast radius' of potential misconfigurations and keeping raw PII physically segregated from processed data.
3. **Medallion Alignment:** This structure aligns with the functional purpose of each tier, supporting different levels of data refinement and the PII handling (pseudonymization and anonymization) enforced at the Silver and Gold levels.

**Why the other options are incorrect:**

* **A:** Centralizing all data in one database makes it extremely difficult to apply fine-grained PII controls and increases the risk of unauthorized access to raw data.
* **C:** Databricks explicitly recommends against using the workspace-local default (DBFS root) for production managed tables. DBFS root lacks the granular governance and cloud-native security controls provided by Unity Catalog.
* **D:** Database organization is a critical component of security and discoverability; ignoring it forfeits the primary governance benefits of Unity Catalog.
* **E:** While Unity Catalog allows for specific storage locations, it does not require a 1:1 mapping between containers and databases. Organizations can define a single managed storage location at the catalog or schema level and reuse it based on actual isolation requirements.
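As a rough illustration of the schema-per-tier approach, the sketch below builds the SQL statements that would create one schema per Medallion tier, each with its own managed storage location, and grant read access at the schema level. The catalog name `prod`, the `s3://example-lake/...` paths, and the group names are all hypothetical placeholders, and the helper names (`Tier`, `tier_statements`, `build_plan`) are invented for this example. It only generates strings; on Databricks the statements would typically be run with `spark.sql(...)` or in a SQL editor.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Tier:
    schema: str            # schema (database) name for this Medallion tier
    managed_location: str  # hypothetical cloud storage path for managed tables
    reader_groups: tuple   # hypothetical account groups allowed to read the tier


def tier_statements(catalog: str, tier: Tier) -> list:
    """Build the SQL for one tier: a schema with its own managed
    location, plus schema-level read grants for each reader group."""
    fq_name = f"{catalog}.{tier.schema}"
    statements = [
        f"CREATE SCHEMA IF NOT EXISTS {fq_name} "
        f"MANAGED LOCATION '{tier.managed_location}'"
    ]
    for group in tier.reader_groups:
        # A schema-level grant is inherited by current and future tables.
        statements.append(
            f"GRANT USE SCHEMA, SELECT ON SCHEMA {fq_name} TO `{group}`"
        )
    return statements


def build_plan(catalog: str, tiers: list) -> list:
    """Flatten the per-tier statements into one ordered list."""
    return [stmt for tier in tiers for stmt in tier_statements(catalog, tier)]


# Assumed layout: Bronze is the most restricted tier, Gold the most open.
TIERS = [
    Tier("bronze", "s3://example-lake/bronze", ("data-engineers",)),
    Tier("silver", "s3://example-lake/silver",
         ("data-engineers", "ml-engineers")),
    Tier("gold", "s3://example-lake/gold",
         ("data-engineers", "ml-engineers", "bi-analysts")),
]

if __name__ == "__main__":
    for statement in build_plan("prod", TIERS):
        print(statement)  # on Databricks: spark.sql(statement)
```

In practice, each managed location would need to fall under an external location already registered in Unity Catalog, and readers would also need `USE CATALOG` on the parent catalog. Both are left out here to keep the sketch short.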
Author: LeetQuiz Editorial Team
A data engineering team is migrating an enterprise system comprising thousands of tables and views into a Medallion Lakehouse architecture. The design utilizes Bronze tables for raw production data, Silver tables for validated ML and engineering workflows, and Gold tables for business intelligence and reporting. While PII is present across all tiers, strict pseudonymization and anonymization rules are enforced at the Silver and Gold levels.
To minimize security risks and enhance cross-team collaboration, which of the following reflects a best practice for implementing this system?
* **A:** Centralize all production tables within a single database to provide a unified perspective of all data assets, streamlining discoverability by granting all users view privileges on this database.
* **B:** Segregate tables into separate databases (schemas) based on data quality tiers to simplify permissions management via database-level ACLs and allow for physical isolation of default storage locations.
* **C:** Utilize the default Databricks database for managed tables, as storing data in the DBFS root provides the most robust security framework for enterprise-scale systems.
* **D:** Avoid complex database organization, as databases on Databricks are primarily logical constructs and do not significantly impact the security or discoverability of assets within the Lakehouse.
* **E:** Provision separate storage containers for every database created, as Unity Catalog requires a 1:1 mapping between storage locations and databases to ensure data isolation.