
Answer-first summary for fast verification
Answer: Database
## Explanation

**Correct Answer: A (Database)**

**Why Database is the correct choice:**

1. **Persistent Storage**: A database in Databricks is a logical collection of tables that is registered in a metastore (Unity Catalog or the Hive metastore) and backed by a physical storage location. It persists across sessions and can be accessed by multiple users.
2. **Cross-session Accessibility**: Databases are catalog objects. Because they are registered in the metastore, other data engineers can access them from different sessions.
3. **Physical Storage**: Databases have associated physical storage locations where table data is actually stored (in cloud storage such as S3 or ADLS).
4. **Table Aggregation**: A database can contain multiple tables, which aligns with the requirement of creating a data entity "from a couple of tables."

**Why Function is incorrect:**

1. **Not a Data Entity**: Functions (user-defined functions, or UDFs) are code objects, not data entities. They are used to transform data, not to store or organize it.
2. **No Physical Storage**: Functions have no physical storage location for data; they are stored as code definitions in the metastore.
3. **Different Purpose**: Functions are used for data processing and transformation, not for organizing and persisting data structures.

**Additional Context:**

In Databricks, databases (the term is a synonym for schemas) are the primary way to organize tables logically. They provide:

- Namespace isolation
- Access control at the database level
- Physical storage management
- Metadata management through the metastore (for example, Unity Catalog)

The data engineer should create a database to organize the tables. It will then be accessible to other engineers across different sessions and will have physical storage backing.
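To make the explanation above concrete, here is a minimal sketch of the Spark SQL a data engineer might run to create a database at an explicit storage location and persist a table built from two source tables. Everything named here is an assumption for illustration: the helper `build_statements`, the database `sales_db`, the path `/mnt/demo/sales_db`, and the tables `orders` and `customers` do not come from the question. The helper only builds the statement strings, so it runs without a cluster; in a Databricks notebook each string would typically be passed to `spark.sql(...)`.

```python
def build_statements(database, location, table, source_tables, join_key):
    """Return Spark SQL strings that create a database and a joined table.

    All arguments are hypothetical placeholders, not values from the question.
    """
    left, right = source_tables
    return [
        # Registers the database in the metastore and pins its storage location.
        f"CREATE DATABASE IF NOT EXISTS {database} LOCATION '{location}'",
        # Persists the join result as a table inside that database.
        f"CREATE TABLE IF NOT EXISTS {database}.{table} AS "
        f"SELECT * FROM {left} JOIN {right} USING ({join_key})",
    ]


statements = build_statements(
    database="sales_db",              # hypothetical database name
    location="/mnt/demo/sales_db",    # hypothetical storage path
    table="customer_orders",          # hypothetical target table
    source_tables=("orders", "customers"),
    join_key="customer_id",
)

for statement in statements:
    print(statement)

# On a Databricks cluster, assuming a SparkSession named `spark`:
# for statement in statements:
#     spark.sql(statement)
```

Because both objects are registered in the metastore and written to the given location, they remain visible from other sessions, which is the property the question asks for.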
Author: Keng Suppaseth
A data engineer wants to create a data entity from a couple of tables. The data entity must be used by other data engineers in other sessions. It also must be saved to a physical location. Which of the following data entities should the data engineer create?
A. Database
B. Function