Databricks Certified Data Engineer - Associate

Which of the following describes a benefit of creating an external table from Parquet rather than CSV when using a CREATE TABLE AS SELECT statement?

KKeng

Last updated: January 13, 2026 at 09:01

A. Parquet files can be partitioned

B. CREATE TABLE AS SELECT statements cannot be used on files

C. Parquet files have a well-defined schema

D. Parquet files have the ability to be optimized

E. Parquet files will become Delta tables

Explanation:

Parquet files have a well-defined schema embedded in the file format itself, whereas CSV files are plain text with no type information and require either schema inference or an explicit schema definition. This matters for CREATE TABLE AS SELECT (CTAS) statements, which take the table schema from the query result and do not accept a manual schema declaration:

Schema Preservation: Parquet files store schema metadata (column names, data types) directly in the file, ensuring data integrity and consistency.
No Schema Inference Issues: With CSV files, Spark either reads every column as a string or, when schema inference is enabled, infers types by scanning the data, which can produce errors (e.g., incorrect data type detection, mishandled null values).
Better Performance: Parquet's columnar format with embedded schema allows for more efficient data processing and querying.
Data Type Support: Parquet supports complex data types (arrays, structs, maps) that CSV cannot natively represent.
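
The schema points above can be illustrated with a minimal sketch that uses only the Python standard library. It covers the CSV side of the comparison: typed values written to CSV come back as plain strings. The sample rows and the `csv_round_trip` helper are hypothetical and are not part of the original question. The Parquet side is not shown because it requires a third-party library.

```python
import csv
import io
from datetime import date

# Hypothetical sample rows with an int, a float, and a date column.
rows = [
    {"id": 1, "amount": 19.99, "order_date": date(2024, 1, 5)},
    {"id": 2, "amount": 5.0, "order_date": date(2024, 1, 6)},
]


def csv_round_trip(records):
    """Write records to CSV text and read them back, as a file reader would."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=list(records[0].keys()))
    writer.writeheader()
    writer.writerows(records)
    buffer.seek(0)
    return list(csv.DictReader(buffer))


restored = csv_round_trip(rows)

# Every value comes back as a string, so the original types must be
# re-declared or inferred by whoever reads the file.
types_after = {name: type(value).__name__ for name, value in restored[0].items()}
print(types_after)  # {'id': 'str', 'amount': 'str', 'order_date': 'str'}
```

A format that stores column names and types alongside the data avoids this re-declaration step, which is the benefit the correct answer describes.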

While options A and D are also true about Parquet files (they can be partitioned and optimized), the most direct benefit specifically for CTAS operations is the well-defined schema, which eliminates schema-related issues that commonly occur with CSV files.
