
Answer-first summary for fast verification
Answer: CREATE TABLE AS SELECT statements adopt schema details from the source table and query.
## Explanation

**Correct Answer: A**

In Databricks and Spark SQL, when using `CREATE TABLE AS SELECT` (CTAS) statements:

1. **Schema Inference**: The schema for the new table is automatically inferred from the result set of the SELECT query, not by scanning the actual data (which would be Option B).
2. **Schema Adoption**: The new table adopts the column names, data types, and other schema details from the SELECT statement's result columns. In this case:
   - `country` column inherits the STRING type from the source table's `country` column
   - `customers` column gets the BIGINT type because `COUNT(*)` returns a BIGINT
3. **Why Other Options Are Incorrect**:
   - **Option B**: While schema inference happens, it's not by "scanning the data" but by analyzing the query's result schema.
   - **Option C**: Schemas are not optional; every table has a schema in Databricks.
   - **Option D**: Columns do not default to STRING; they inherit appropriate types from the query.
   - **Option E**: All tables in Databricks support schemas; this is a fundamental feature.

**Key Concept**: CTAS statements in Databricks automatically derive the schema from the SELECT query's result structure, eliminating the need for explicit schema declaration when creating tables from existing data.
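One way to check the explanation is to inspect the table that the CTAS statement produces and compare it with a version whose schema is declared by hand. The following is a minimal sketch in Spark SQL, assuming `customerLocations` and `customersPerCountry` already exist as described in the question. The table name `customersPerCountryExplicit` is hypothetical and used only for illustration.

```sql
-- Inspect the schema that the CTAS statement derived from the query.
-- Per the explanation, this should report country as string and customers as bigint.
DESCRIBE TABLE customersPerCountry;

-- Roughly equivalent two-step version with the schema declared explicitly.
-- customersPerCountryExplicit is a hypothetical name, not part of the question.
CREATE TABLE customersPerCountryExplicit (
  country   STRING,
  customers BIGINT
);

-- Populate the explicitly declared table with the same aggregation.
INSERT INTO customersPerCountryExplicit
SELECT country,
       COUNT(*) AS customers
FROM customerLocations
GROUP BY country;
```

The two-step version declares by hand the same types that CTAS takes from the query's result columns, which is why the original command can omit the schema.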
Author: Keng Suppaseth
A table customerLocations exists with the following schema:
id STRING,
date STRING,
city STRING,
country STRING
A senior data engineer wants to create a new table from this table using the following command:
CREATE TABLE customersPerCountry AS
SELECT country,
COUNT(*) AS customers
FROM customerLocations
GROUP BY country;
A junior data engineer asks why the schema is not being declared for the new table.
Which of the following responses explains why declaring the schema is not necessary?
A
CREATE TABLE AS SELECT statements adopt schema details from the source table and query.
B
CREATE TABLE AS SELECT statements infer the schema by scanning the data.