
Answer-first summary for fast verification
Answer: Table
## Explanation

The correct answer is **A. Table** because:

1. **Persistent across sessions**: Tables are saved to physical storage (such as DBFS or cloud storage) and persist beyond the current Spark session, allowing other data engineers to access them in different sessions.
2. **Physical storage requirement**: The question explicitly states the data entity "must be saved to a physical location." Tables are physically stored data structures, while views are logical abstractions that don't store data physically.
3. **Multi-table composition**: The data engineer wants to create a data entity "from a couple of tables," which suggests they need to combine data from multiple source tables into a new persistent structure.

**Why other options are incorrect**:

- **B. Function**: Functions are reusable code/logic, not data entities that store or combine data from tables.
- **C. View**: Views are logical representations that don't store data physically. They are saved as metadata, but the underlying data isn't duplicated to a physical location.
- **D. Temporary view**: Temporary views are session-scoped and don't persist beyond the current session, so they cannot be used by other data engineers in other sessions.

**Best Practice**: When you need to create a reusable data entity from multiple tables that persists across sessions and is physically stored, you should create a table (either managed or external). This could be done using `CREATE TABLE AS SELECT` or by writing the results of a query to a table location.
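The persistence differences described above can be sketched with Python's built-in `sqlite3` module. This is an analogy only: SQLite connections stand in for Spark sessions, and the database file stands in for the physical storage location. The table names (`orders`, `customers`, `customer_orders`), the columns, and the file path are all hypothetical, and Databricks behavior differs in detail.

```python
import os
import sqlite3
import tempfile

# Hypothetical database file standing in for a physical storage location.
db_path = os.path.join(tempfile.mkdtemp(), "demo.db")

# Session 1: build two source tables, then derive each kind of entity.
session1 = sqlite3.connect(db_path)
session1.executescript("""
    CREATE TABLE orders (order_id INTEGER, customer_id INTEGER, amount REAL);
    CREATE TABLE customers (customer_id INTEGER, name TEXT);
    INSERT INTO orders VALUES (1, 10, 25.0), (2, 11, 40.0);
    INSERT INTO customers VALUES (10, 'Ada'), (11, 'Grace');

    -- Table: the query result is materialized and written to storage.
    CREATE TABLE customer_orders AS
        SELECT c.name, o.amount
        FROM orders o JOIN customers c ON o.customer_id = c.customer_id;

    -- View: only the query definition is saved, not a copy of the rows.
    CREATE VIEW customer_orders_v AS
        SELECT c.name, o.amount
        FROM orders o JOIN customers c ON o.customer_id = c.customer_id;

    -- Temporary view: exists only for this connection.
    CREATE TEMP VIEW customer_orders_tmp AS
        SELECT c.name, o.amount
        FROM orders o JOIN customers c ON o.customer_id = c.customer_id;
""")
session1.commit()
session1.close()

# Session 2: a different engineer connects later.
session2 = sqlite3.connect(db_path)
visible = {
    name: kind
    for name, kind in session2.execute(
        "SELECT name, type FROM sqlite_master "
        "WHERE name LIKE 'customer_orders%'"
    )
}
rows = session2.execute(
    "SELECT name, amount FROM customer_orders ORDER BY name"
).fetchall()
session2.close()

print(visible)  # the temporary view is absent in the second session
print(rows)     # rows physically stored by the CREATE TABLE ... AS SELECT
```

In the second session the table and the view are both still defined, but only the table holds its own stored copy of the joined rows, and the temporary view is gone. That matches the reasoning for choosing a table when the entity must be both shared across sessions and saved to a physical location.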
Author: Keng Suppaseth
A data engineer wants to create a data entity from a couple of tables. The data entity must be used by other data engineers in other sessions. It also must be saved to a physical location.
Which of the following data entities should the data engineer create?
A. Table
B. Function
C. View
D. Temporary view