
A data engineer needs to create a table in Databricks using data from a CSV file at location /path/to/csv. They run the following command:
CREATE TABLE new_table
_______
OPTIONS (
header = "true",
delimiter = "|"
)
LOCATION "path/to/csv"
Which of the following lines of code fills in the above blank to successfully complete the task?
A. None of these lines of code are needed to successfully complete the task
B. USING CSV
C. FROM CSV
D. USING DELTA
E. FROM "path/to/csv"
Explanation:
In Databricks SQL, when creating a table from a CSV file using the CREATE TABLE statement with the LOCATION clause, you need to specify the data source format using the USING keyword.
The correct syntax is:
CREATE TABLE table_name
USING data_source_format
OPTIONS (options)
LOCATION "path/to/data"
For CSV files, the correct format specification is USING CSV. This tells Databricks that the data at the specified location is in CSV format and should be parsed accordingly.
Let's examine why the other options are incorrect:
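A: The USING clause is required to specify the data source format. Without it, Databricks defaults to the Delta format, so the CSV parsing options would not apply.
C and E: FROM is used in SELECT statements, not in CREATE TABLE statements, to specify a data source format.
D: USING DELTA names the wrong format, because the data at the location is CSV, not Delta.
The complete correct statement would be: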
CREATE TABLE new_table
USING CSV
OPTIONS (
header = "true",
delimiter = "|"
)
LOCATION "path/to/csv"
This creates an external table that references the CSV file at the specified location, with the specified options for parsing (header row present, pipe delimiter).
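To make the two parsing options concrete, here is a minimal sketch in plain Python, using only the standard csv module, of what a header row and a pipe delimiter mean for the parsed result. This is not Databricks code: the function parse_delimited and the sample data are hypothetical and exist only for illustration. The _c0, _c1 fallback names imitate the default column names Spark assigns when no header is read.

```python
import csv
import io

# Hypothetical sample data: pipe-delimited text with a header row.
SAMPLE = "id|name|city\n1|Ada|London\n2|Linus|Helsinki\n"


def parse_delimited(text, delimiter="|", header=True):
    """Parse delimited text into a list of dicts keyed by column name.

    Illustrative stand-in for the `header` and `delimiter` options;
    it does not call any Databricks or Spark API.
    """
    rows = list(csv.reader(io.StringIO(text), delimiter=delimiter))
    if not rows:
        return []
    if header:
        # The first line supplies the column names and is not data.
        columns, data = rows[0], rows[1:]
    else:
        # No header: generate positional names and keep every line as data.
        columns = [f"_c{i}" for i in range(len(rows[0]))]
        data = rows
    return [dict(zip(columns, row)) for row in data]


if __name__ == "__main__":
    for row in parse_delimited(SAMPLE):
        print(row)
```

With the wrong delimiter (for example a comma), each line collapses into a single column, which is why the delimiter option has to match the file.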