
A table customerLocations exists with the following schema:
id STRING,
date STRING,
city STRING,
country STRING
A senior data engineer wants to create a new table from this table using the following command:
CREATE TABLE customersPerCountry AS
SELECT country,
COUNT(*) AS customers
FROM customerLocations
GROUP BY country;
A junior data engineer asks why the schema is not being declared for the new table.
Which of the following responses explains why declaring the schema is not necessary?
A
CREATE TABLE AS SELECT statements adopt schema details from the source table and query.
B
CREATE TABLE AS SELECT statements infer the schema by scanning the data.
Explanation:
Correct Answer: A
In Databricks and Spark SQL, when using CREATE TABLE AS SELECT (CTAS) statements:
Schema Inference: The schema for the new table is automatically inferred from the result set of the SELECT query, not by scanning the actual data (which would be Option B).
Schema Adoption: The new table adopts the column names, data types, and other schema details from the SELECT statement's result columns. In this case:
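country column inherits the STRING type from the source table's country column
customers column gets the BIGINT type because COUNT(*) returns a BIGINT
Why Other Options Are Incorrect:
B: CTAS does not scan the stored data to work out a schema; the column names and types come from the result set of the SELECT query.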
Key Concept: CTAS statements in Databricks automatically derive the schema from the SELECT query's result structure, eliminating the need for explicit schema declaration when creating tables from existing data.
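One way to check the derived schema is to describe the new table once the CTAS statement has run. The sketch below assumes the customersPerCountry table from the question already exists. The second statement uses a hypothetical table name, customersPerCountryInt, to illustrate that a column type can be changed by casting in the SELECT rather than by declaring a schema.

```sql
-- Assumes customersPerCountry was created by the CTAS statement in the question.
DESCRIBE TABLE customersPerCountry;
-- Per the explanation above, the listed columns should be:
--   country    string   (carried over from customerLocations.country)
--   customers  bigint   (the result type of COUNT(*))

-- Hypothetical variant: customersPerCountryInt is an illustrative name only.
-- The type is controlled through the query, here with an explicit CAST.
CREATE TABLE customersPerCountryInt AS
SELECT country,
       CAST(COUNT(*) AS INT) AS customers
FROM customerLocations
GROUP BY country;
```

In both cases the schema follows from the query's result columns, which is why the senior engineer's statement needs no column definitions.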