Question 16
A table customerLocations exists with the following schema:
id STRING,
date STRING,
city STRING,
country STRING
A senior data engineer wants to create a new table from this table using the following command:
CREATE TABLE customersPerCountry AS
SELECT country,
COUNT(*) AS customers
FROM customerLocations
GROUP BY country;
A junior data engineer asks why the schema is not being declared for the new table.
Which of the following responses explains why declaring the schema is not necessary?
Explanation:
In Databricks and Spark SQL, the CREATE TABLE AS SELECT (CTAS) statement automatically adopts its schema from the result of the query. Here's why:
Option A is correct: CTAS statements inherit the schema definition from the SELECT query's result set. The new table customersPerCountry will have:
- a country column with STRING type (inherited from the source table)
- a customers column with BIGINT type (the result type of COUNT(*))

Option B is incorrect: Schema inference by scanning data is not how CTAS works. The schema is determined at query compilation time, not by scanning actual data.
Option C is incorrect: Schemas are not optional in Databricks tables; all tables have defined schemas.
Option D is incorrect: CTAS does not default all columns to STRING type. It preserves the actual data types from the query result.
Option E is incorrect: All tables in Databricks support schemas and have defined schemas.
The CTAS statement automatically determines the schema based on:
- the source column types (country is a STRING inherited from customerLocations)
- the result types of the query's expressions (COUNT(*) returns BIGINT)

This eliminates the need for manual schema declaration when creating tables from existing data.
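The inheritance behavior described above can be demonstrated end to end. The following minimal sketch uses Python's built-in sqlite3 module rather than Spark SQL (so it can run anywhere), with made-up sample rows; SQLite's type system is looser than Spark's, but the principle is the same: the CTAS table's columns and types come from the SELECT's result set, with no schema declared.

```python
import sqlite3

# In-memory database with a hypothetical customerLocations table
# mirroring the schema from the question.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE customerLocations (
        id TEXT, date TEXT, city TEXT, country TEXT
    );
    INSERT INTO customerLocations VALUES
        ('1', '2024-01-01', 'Paris',  'France'),
        ('2', '2024-01-02', 'Lyon',   'France'),
        ('3', '2024-01-03', 'Berlin', 'Germany');

    -- CTAS: no column list or types declared; the new table's
    -- schema is taken from the SELECT's result set.
    CREATE TABLE customersPerCountry AS
    SELECT country, COUNT(*) AS customers
    FROM customerLocations
    GROUP BY country;
""")

# Inspect the inherited schema of the new table.
schema = conn.execute("PRAGMA table_info(customersPerCountry)").fetchall()
for _cid, name, col_type, *_rest in schema:
    print(name, col_type)

# The aggregated data is present without any manual schema work.
rows = conn.execute(
    "SELECT country, customers FROM customersPerCountry ORDER BY country"
).fetchall()
print(rows)
```

In Spark SQL the equivalent check would be DESCRIBE TABLE customersPerCountry, which would show country as STRING and customers as BIGINT.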