
Answer-first summary for fast verification
Answer: Temporary view
## Explanation

**Correct Answer: D (Temporary view)**

### Why Temporary View is the Best Choice:

1. **Session-based**: A temporary view exists only for the duration of the current Spark session, which matches the requirement that "the relational object does not need to be used by other data engineers in other sessions."
2. **No Physical Storage**: A temporary view does not store physical data. It is a saved query definition that references the underlying tables. Each time it is queried, the saved query runs against the source tables, so no data is duplicated.
3. **Cost-Effective**: Because no physical data is copied or stored, the view adds no storage cost, as the requirement specifies.

### Why Other Options Are Incorrect:

- **A. Spark SQL Table**: Creates a physical table that stores data, which would incur storage costs.
- **B. View**: A view also stores no physical data, but its definition is persisted in the metastore and remains available to other users and sessions. That is more than the scenario needs, since the object is only used in the current session.
- **C. Database**: A database is a container for tables and views, not a relational object that pulls data from tables.
- **E. Delta Table**: Creates a physical Delta table that stores data, incurring storage costs.

### Key Characteristics of Temporary Views:

- Created using syntax like `CREATE TEMP VIEW view_name AS query`
- Session-scoped (the view disappears when the session ends)
- No physical storage of data
- Well suited to intermediate results within a single session
- In Databricks, events that leave a temporary view unavailable include: opening a new notebook (which runs in its own session), detaching and reattaching a notebook to a cluster, installing Python packages (which restarts the Python interpreter), or restarting the cluster
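The session-scoped, no-copy behavior described above can be sketched in a few lines. This is a minimal illustration only: it assumes no Spark cluster is available and uses Python's built-in `sqlite3` module as a stand-in, because SQLite accepts the same `CREATE TEMP VIEW ... AS` statement and scopes the view to the connection that created it. The table names (`customers`, `orders`), the view name (`customer_orders`), and the sample rows are hypothetical. In Databricks the same statement would be run through Spark SQL, and the scope would be the Spark session rather than a SQLite connection.

```python
import os
import sqlite3
import tempfile

# Assumption: SQLite stands in for Spark SQL; tables and rows are hypothetical.
db_path = os.path.join(tempfile.mkdtemp(), "demo.db")

# "Session A": create two source tables and load a few rows.
session_a = sqlite3.connect(db_path)
session_a.executescript("""
    CREATE TABLE customers (customer_id INTEGER, name TEXT);
    CREATE TABLE orders (order_id INTEGER, customer_id INTEGER, amount REAL);
    INSERT INTO customers VALUES (1, 'Ada'), (2, 'Grace');
    INSERT INTO orders VALUES (10, 1, 25.0), (11, 1, 5.0), (12, 2, 40.0);
""")
session_a.commit()

# The temporary view saves only the query definition; no rows are copied.
session_a.execute("""
    CREATE TEMP VIEW customer_orders AS
    SELECT c.name, o.order_id, o.amount
    FROM customers AS c
    JOIN orders AS o ON o.customer_id = c.customer_id
""")

# Querying the view runs the saved join against the source tables.
rows_a = session_a.execute(
    "SELECT name, order_id, amount FROM customer_orders ORDER BY order_id"
).fetchall()

# "Session B": a second connection sees the tables but not the temp view.
session_b = sqlite3.connect(db_path)
try:
    session_b.execute("SELECT * FROM customer_orders").fetchall()
    visible_in_b = True
except sqlite3.OperationalError:
    visible_in_b = False

table_rows_in_b = session_b.execute("SELECT COUNT(*) FROM orders").fetchone()[0]

print(rows_a)
print(visible_in_b, table_rows_in_b)

session_a.close()
session_b.close()
```

The second connection can still read the underlying `orders` table, which shows that only the view definition, not the data, was tied to the first session.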
Author: Keng Suppaseth
A data engineer wants to create a relational object by pulling data from two tables. The relational object does not need to be used by other data engineers in other sessions. In order to save on storage costs, the data engineer wants to avoid copying and storing physical data.
Which of the following relational objects should the data engineer create?
- A. Spark SQL Table
- B. View
- C. Database
- D. Temporary view
- E. Delta Table