
A data engineer is attempting to drop a Spark SQL table my_table and runs the following command:
DROP TABLE IF EXISTS my_table;
After running this command, the engineer notices that the data files and metadata files have been deleted from the file system.
Which of the following describes why all of these files were deleted?
A
The table was managed
B
The table's data was smaller than 10 GB
C
The table's data was larger than 10 GB
D
The table was external
E
The table did not have a location
Explanation:
In Databricks/Spark SQL, there are two types of tables:
Managed Tables (also called internal tables):
When you run DROP TABLE, both the metadata (table definition) AND the underlying data files are deleted.
External Tables (also called unmanaged tables):
When you run DROP TABLE, only the metadata (table definition) is deleted; the underlying data files are left in place.
In this scenario, both the metadata files AND the data files were deleted from the file system after running DROP TABLE IF EXISTS my_table;, which indicates that my_table was a managed table. The correct answer is A.
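The difference can be sketched in Spark SQL as follows. This is a minimal illustration, not taken from the question itself: the table names managed_demo and external_demo and the path /tmp/demo/external_demo are hypothetical, and USING PARQUET is just one format choice. On a governed workspace the LOCATION path may also need to be a permitted external location.

```sql
-- Hypothetical managed table: no LOCATION clause, so Spark stores the
-- data files under its default warehouse directory and owns them.
CREATE TABLE IF NOT EXISTS managed_demo (id INT) USING PARQUET;

-- Hypothetical external table: the LOCATION clause points at a path
-- that Spark does not own (the path is an assumption for this sketch).
CREATE TABLE IF NOT EXISTS external_demo (id INT) USING PARQUET
LOCATION '/tmp/demo/external_demo';

INSERT INTO managed_demo VALUES (1), (2);
INSERT INTO external_demo VALUES (1), (2);

-- Removes the catalog entry AND the data files of the managed table.
DROP TABLE IF EXISTS managed_demo;

-- Removes only the catalog entry; the files under
-- /tmp/demo/external_demo remain on storage.
DROP TABLE IF EXISTS external_demo;
```

After the second DROP, listing the external path should still show the data files, while the managed table's directory under the warehouse is gone.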
Additional context:
Managed tables are created without a LOCATION clause, or with data stored in the default warehouse directory.
External tables are created with a LOCATION clause pointing to external storage.
The IF EXISTS clause simply prevents an error if the table doesn't exist; it doesn't affect whether data gets deleted.
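To check which kind of table you are about to drop, one option is to inspect its metadata first. The sketch below assumes my_table still exists in the current schema.

```sql
-- Prints detailed table metadata. In the output, the "Type" row reads
-- MANAGED or EXTERNAL, and the "Location" row shows where the data
-- files are stored.
DESCRIBE TABLE EXTENDED my_table;
```

If the Type row reads MANAGED, dropping the table is expected to remove the data files as well as the table definition.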