
A data engineer is attempting to drop a Spark SQL table my_table and runs the following command:
DROP TABLE IF EXISTS my_table;
After running this command, the engineer notices that both the table's metadata and its underlying data files have been deleted from the file system.
What is the reason behind the deletion of all these files?
A
The table was managed
B
The table's data was smaller than 10 GB
C
The table did not have a location
D
The table was external
Explanation:
In Databricks/Spark SQL, there are two types of tables:
Managed Tables: These are tables where Spark manages both the metadata AND the data. When you drop a managed table using DROP TABLE, Spark deletes both the metadata from the metastore AND the underlying data files from the storage.
External Tables: These are tables where Spark only manages the metadata, while the data is stored in an external location that you specify. When you drop an external table, Spark only deletes the metadata from the metastore, but leaves the underlying data files intact.
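As a sketch, the two behaviors can be contrasted with Spark SQL DDL. The table names and the storage path /mnt/demo/events are illustrative only:

```sql
-- Managed table: Spark controls both the metadata and the data location.
CREATE TABLE managed_events (id INT, name STRING);

-- External table: data lives at a path you specify via LOCATION.
CREATE TABLE external_events (id INT, name STRING)
LOCATION '/mnt/demo/events';  -- hypothetical path

-- Dropping the managed table removes the metastore entry AND the data files.
DROP TABLE managed_events;

-- Dropping the external table removes only the metastore entry;
-- the files under /mnt/demo/events remain in storage.
DROP TABLE external_events;
```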
In this scenario:
Running DROP TABLE IF EXISTS my_table; removed both the metadata and the underlying data files, which means my_table was a managed table. The correct answer is A.
Key points:
Best Practice: Always be aware of whether you're working with managed or external tables, especially when performing destructive operations like DROP TABLE.
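One way to check before running a destructive command (my_table here is illustrative) is to inspect the table with DESCRIBE EXTENDED, which reports whether it is managed or external in the Type row of its output:

```sql
DESCRIBE EXTENDED my_table;
-- In the extended output, look for the "Type" row:
--   Type: MANAGED   -> DROP TABLE also deletes the data files
--   Type: EXTERNAL  -> DROP TABLE leaves the data files in place
```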