
A data engineer wants to create a data entity from a couple of tables. The data entity must be used by other data engineers in other sessions. It also must be saved to a physical location.
Which of the following data entities should the data engineer create?
A. Table
B. Function
C. View
D. Temporary view
Explanation:
The correct answer is A (Table) because:
Persistent across sessions: Tables are saved to physical storage (like DBFS, cloud storage) and persist beyond the current Spark session, allowing other data engineers to access them in different sessions.
Physical storage requirement: The question explicitly states that the data entity "must be saved to a physical location." A table stores its data as files in physical storage, whereas a view saves only its query definition and recomputes the result from the source tables whenever it is queried.
Multi-table composition: The data engineer wants to create a data entity "from a couple of tables," which suggests they need to combine data from multiple source tables into a new persistent structure.
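Why other options are incorrect:
B. Function: A function encapsulates reusable logic; it does not store the data it returns.
C. View: A view persists across sessions, but only its query definition is saved. No data is written to a physical location.
D. Temporary view: A temporary view is scoped to the Spark session that created it, so other sessions cannot see it, and it is not saved to a physical location.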
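The session scoping of a temporary view can be illustrated with a small sketch. It uses Python's built-in sqlite3 module as a stand-in for Spark SQL (an assumption made so the example runs without a cluster), with each connection playing the role of a session. The table and view names are hypothetical, and Spark's catalog differs from SQLite in detail.

```python
import os
import sqlite3
import tempfile

# Assumption: SQLite stands in for Spark SQL; each connection acts as one "session".
db_path = os.path.join(tempfile.mkdtemp(), "demo.db")


def visible_objects(conn):
    """Return the names of all tables and views this connection can see."""
    rows = conn.execute(
        "SELECT name FROM sqlite_master WHERE type IN ('table', 'view') "
        "UNION ALL "
        "SELECT name FROM sqlite_temp_master WHERE type IN ('table', 'view')"
    ).fetchall()
    return {name for (name,) in rows}


# Session 1 creates a source table, a temporary view, and a regular view.
session_1 = sqlite3.connect(db_path)
session_1.execute("CREATE TABLE orders (id INTEGER, amount REAL)")
session_1.execute("INSERT INTO orders VALUES (1, 10.0), (2, 25.5)")
session_1.execute(
    "CREATE TEMP VIEW big_orders_tmp AS SELECT * FROM orders WHERE amount > 20"
)
session_1.execute(
    "CREATE VIEW big_orders_v AS SELECT * FROM orders WHERE amount > 20"
)
session_1.commit()

# Session 2 is a separate connection to the same database file.
session_2 = sqlite3.connect(db_path)

seen_by_1 = sorted(visible_objects(session_1))
seen_by_2 = sorted(visible_objects(session_2))
session_1.close()
session_2.close()

print(seen_by_1)  # ['big_orders_tmp', 'big_orders_v', 'orders']
print(seen_by_2)  # ['big_orders_v', 'orders']
```

The temporary view is missing from the second session. The regular view is visible there, but it holds only a stored query, not a copy of the rows.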
Best Practice: When you need to create a reusable data entity from multiple tables that persists across sessions and is physically stored, you should create a table (either managed or external). This could be done using CREATE TABLE AS SELECT or by writing the results of a query to a table location.
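The CREATE TABLE AS SELECT pattern can be sketched as follows. Python's built-in sqlite3 module again stands in for Spark SQL (an assumption made so the example is runnable anywhere), and the table names, column names, and sample rows are hypothetical.

```python
import os
import sqlite3
import tempfile

# Assumption: SQLite stands in for Spark SQL; all names and rows are made up.
db_path = os.path.join(tempfile.mkdtemp(), "warehouse.db")

writer = sqlite3.connect(db_path)
writer.execute("CREATE TABLE customers (customer_id INTEGER, name TEXT)")
writer.execute(
    "CREATE TABLE orders (order_id INTEGER, customer_id INTEGER, amount REAL)"
)
writer.execute("INSERT INTO customers VALUES (1, 'Ada'), (2, 'Lin')")
writer.execute(
    "INSERT INTO orders VALUES (100, 1, 10.0), (101, 1, 5.0), (102, 2, 7.5)"
)

# CREATE TABLE AS SELECT: materialize the join of the two source tables.
writer.execute(
    """
    CREATE TABLE customer_totals AS
    SELECT c.customer_id, c.name, SUM(o.amount) AS total_amount
    FROM customers AS c
    JOIN orders AS o ON o.customer_id = c.customer_id
    GROUP BY c.customer_id, c.name
    """
)
writer.commit()
writer.close()

# A separate connection (a different "session") reads the stored rows.
reader = sqlite3.connect(db_path)
totals = reader.execute(
    "SELECT name, total_amount FROM customer_totals ORDER BY customer_id"
).fetchall()
reader.close()

stored_on_disk = os.path.getsize(db_path) > 0

print(totals)          # [('Ada', 15.0), ('Lin', 7.5)]
print(stored_on_disk)  # True
```

In Spark the same idea would typically be expressed with a CREATE TABLE ... AS SELECT statement, or by writing a DataFrame with `saveAsTable`. Whether the result is a managed or an external table depends on how its location is specified.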