Explanation
When comparing Parquet vs CSV for external tables in Databricks:
Parquet advantages:
- Schema enforcement: Parquet files have a well-defined schema embedded within the file format itself, which includes data types, column names, and metadata.
- Columnar storage: Parquet is a columnar format optimized for analytics workloads.
- Compression: Better compression ratios compared to CSV.
- Schema evolution: new columns can be added over time and reconciled across files (for example, via schema merging on read).
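As a minimal sketch of how the embedded schema pays off, here is what creating an external Parquet table looks like in Databricks SQL. The table name and storage path are hypothetical; the point is that no column list is required, because the columns and types are read from the Parquet file footers:

```sql
-- Hypothetical table and path; schema comes from the Parquet files themselves.
CREATE TABLE sales_parquet
USING PARQUET
LOCATION 's3://my-bucket/sales/parquet/';

-- Column names and data types are taken from the embedded file schema.
DESCRIBE TABLE sales_parquet;
```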
CSV limitations:
- No embedded schema: CSV files carry no schema information; the schema must be inferred from the data or declared explicitly.
- Type inference issues: Databricks must infer data types from the CSV content, which requires scanning the data and can misread types (for example, treating ZIP codes as integers).
- No built-in compression: plain-text files are typically much larger; external compression such as gzip helps but makes each file non-splittable for parallel reads.
- Parsing overhead: every query re-parses the text, and whole rows must be read even when only a few columns are needed.
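For contrast, a sketch of the equivalent external CSV table (again with hypothetical names and path): because CSV carries no schema, you either declare every column explicitly, as below, or fall back to inference over the text:

```sql
-- Hypothetical table and path; the column list must be supplied by hand,
-- otherwise types are inferred from the raw text and may come back wrong.
CREATE TABLE sales_csv (
  order_id   BIGINT,
  amount     DECIMAL(10, 2),
  order_date DATE
)
USING CSV
OPTIONS (header 'true')
LOCATION 's3://my-bucket/sales/csv/';
```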
Why other options are incorrect:
- A: Both Parquet and CSV files can be partitioned in Databricks.
- B: External tables created from Parquet files don't automatically become Delta tables; they remain external tables pointing at the underlying Parquet files unless explicitly converted.
- D: While Parquet files can be optimized, this is not the primary benefit over CSV for external tables.
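To illustrate the point about option B: in Databricks, turning a Parquet table into a Delta table is an explicit step via `CONVERT TO DELTA`, not something that happens automatically. The table name and path below are hypothetical:

```sql
-- Explicit conversion of an existing external Parquet table to Delta.
CONVERT TO DELTA sales_parquet;

-- Or convert a Parquet directory directly by path.
CONVERT TO DELTA parquet.`s3://my-bucket/sales/parquet/`;
```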
The key benefit is that Parquet's embedded schema eliminates schema inference issues and provides better type safety compared to CSV.