
A junior data engineer needs to create a Spark SQL table my_table for which Spark manages both the data and the metadata. The metadata and data should also be stored in the Databricks Filesystem (DBFS).
Which of the following commands should a senior data engineer share with the junior data engineer to complete this task?
A
CREATE TABLE my_table (id STRING, value STRING) USING org.apache.spark.sql.parquet OPTIONS (PATH "storage-path");
B
CREATE MANAGED TABLE my_table (id STRING, value STRING) USING org.apache.spark.sql.parquet OPTIONS (PATH "storage-path");
C
CREATE MANAGED TABLE my_table (id STRING, value STRING);
D
CREATE TABLE my_table (id STRING, value STRING) USING DBFS;
E
CREATE TABLE my_table (id STRING, value STRING);
Explanation:
In Databricks, a managed table is one where Spark manages both the data and metadata. When you create a table without specifying a location using CREATE TABLE, it becomes a managed table by default, and the data is stored in the Databricks Filesystem (DBFS) under the default location.
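As a minimal sketch of how this could be checked, assuming a workspace that uses the default Hive metastore, the statements below create the table and then inspect its metadata. In the DESCRIBE output, the Type row should read MANAGED and the Location row should point into the default warehouse directory. The exact path depends on the workspace configuration.

```sql
-- No LOCATION clause and no PATH option, so Spark manages both data and metadata.
CREATE TABLE my_table (id STRING, value STRING);

-- Show detailed table metadata, including the Type and Location rows.
DESCRIBE TABLE EXTENDED my_table;
```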
Let's analyze each option:
A. CREATE TABLE my_table (id STRING, value STRING) USING org.apache.spark.sql.parquet OPTIONS (PATH "storage-path");
Specifying a PATH makes this an external (unmanaged) table: Spark manages only the metadata, and the data stays at the given path.
B. CREATE MANAGED TABLE my_table (id STRING, value STRING) USING org.apache.spark.sql.parquet OPTIONS (PATH "storage-path");
CREATE MANAGED TABLE is not valid Spark SQL syntax. The correct way to create a managed table is simply CREATE TABLE without specifying a location.
C. CREATE MANAGED TABLE my_table (id STRING, value STRING);
CREATE MANAGED TABLE is not a valid SQL command in Spark.
D. CREATE TABLE my_table (id STRING, value STRING) USING DBFS;
USING DBFS is not a valid format specification in Spark SQL.
E. CREATE TABLE my_table (id STRING, value STRING);
This is the correct answer. With no location specified, Spark creates a managed table and stores its data in the default DBFS location (dbfs:/user/hive/warehouse/ or similar).
Key Points:
CREATE TABLE without a location specification (no LOCATION clause or PATH option) creates a managed table, and its data is stored in the default DBFS location.
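For contrast, here is a sketch of the external case. The name my_external_table is hypothetical, and 'storage-path' is the same placeholder used in option A, not a real path. The practical difference shows up when the tables are dropped.

```sql
-- External (unmanaged) table: Spark tracks only the metadata.
CREATE TABLE my_external_table (id STRING, value STRING)
USING PARQUET
LOCATION 'storage-path';

-- Removes the metastore entry; files under 'storage-path' are left in place.
DROP TABLE my_external_table;

-- Managed table: removes the metastore entry and the underlying data files.
DROP TABLE my_table;
```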