Databricks Certified Data Engineer - Associate


A data engineer has realized that the data files associated with a Delta table are incredibly small. They want to compact the small files to form larger files to improve performance. Which of the following keywords can be used to compact the small files?


KKeng

Last updated: January 13, 2026 at 09:03

REDUCE

OPTIMIZE

COMPACTION

REPARTITION

VACUUM

Explanation:


In Databricks Delta Lake, the OPTIMIZE command is used to compact small files into larger files to improve query performance. Here's why:

OPTIMIZE: This command performs file compaction (also known as bin-packing) on Delta tables. It merges small files into larger ones, which improves read performance by reducing the number of files that need to be read during queries.
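
To make the bin-packing idea concrete, here is a minimal Python sketch of greedy compaction. It is an illustration only and does not claim to reproduce Delta Lake's actual algorithm: the `compact` function, the file sizes, and the 64 MB target are all made up for this example.

```python
def compact(file_sizes_mb, target_mb):
    """Greedy next-fit bin-packing: group files into outputs of at most target_mb.

    A single file larger than target_mb ends up alone in its own group.
    """
    outputs = []
    current = []
    current_size = 0
    for size in file_sizes_mb:
        # Close the current output once the next file would overflow the target.
        if current and current_size + size > target_mb:
            outputs.append(current)
            current, current_size = [], 0
        current.append(size)
        current_size += size
    if current:
        outputs.append(current)
    return outputs


# Hypothetical small files, sizes in MB.
small_files = [8, 12, 5, 20, 16, 9, 30, 4, 10, 14]
compacted = compact(small_files, target_mb=64)

print(len(small_files), "files ->", len(compacted), "files")
print([sum(group) for group in compacted])
```

The same data ends up in fewer, larger files, so a query has fewer files to open and read.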

Why other options are incorrect:
- REDUCE: Not a Delta Lake command.
- COMPACTION: Describes the operation conceptually, but it is not a Delta Lake keyword.
- REPARTITION: A Spark DataFrame transformation that redistributes data across partitions during a job. It is not a Delta Lake command for compacting a table's existing files.
- VACUUM: Deletes data files that are no longer referenced by the Delta table and are older than the retention threshold (7 days by default). It frees storage but does not merge small files.
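
The difference between OPTIMIZE and VACUUM can be sketched with a toy model in Python. This is a simplified illustration under stated assumptions, not how Delta Lake is implemented: the dictionaries, the `optimize` and `vacuum` functions, and the `age_hours` field are hypothetical, and the 168-hour threshold stands in for the 7-day default retention period.

```python
RETENTION_HOURS = 168  # 7 days; stand-in for the retention threshold in this sketch


def optimize(files):
    """Rewrite all live files into one larger file; the originals become unreferenced."""
    live = [f for f in files if f["live"]]
    dead = [f for f in files if not f["live"]]
    retired = [{**f, "live": False, "age_hours": 0} for f in live]
    merged = {
        "name": "compacted-0",
        "size_mb": sum(f["size_mb"] for f in live),
        "live": True,
        "age_hours": 0,
    }
    return dead + retired + [merged]


def vacuum(files):
    """Delete unreferenced files older than the threshold; live files are never touched."""
    return [f for f in files if f["live"] or f["age_hours"] <= RETENTION_HOURS]


# Five hypothetical 4 MB files that all belong to the current table version.
table = [{"name": f"part-{i}", "size_mb": 4, "live": True, "age_hours": 0} for i in range(5)]

vacuum_only = vacuum(table)       # nothing is unreferenced, so nothing changes
after_optimize = optimize(table)  # one live 20 MB file, five old files still on storage

# Simulate 200 hours passing, then clean up the unreferenced originals.
aged = [{**f, "age_hours": f["age_hours"] + 200} for f in after_optimize]
after_both = vacuum(aged)

print(len(vacuum_only), len(after_optimize), len(after_both))
```

In this model VACUUM alone leaves the five small files as they are, while OPTIMIZE produces the single larger file and VACUUM later removes the leftovers.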

How to use OPTIMIZE:

OPTIMIZE table_name


Or with Z-ordering:

OPTIMIZE table_name
ZORDER BY column_name


Benefits:
- Reduces the number of files to read
- Improves query performance
- Can be combined with Z-ordering for better data skipping
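
The data-skipping benefit can be illustrated with a small Python sketch. It assumes each file stores min/max statistics for a column and simplifies Z-ordering to a plain sort on a single column (real Z-ordering interleaves several columns). The `write_files` and `files_scanned` helpers and the data are hypothetical.

```python
def write_files(values, rows_per_file):
    """Split values into files and record each file's min/max statistics."""
    files = []
    for start in range(0, len(values), rows_per_file):
        chunk = values[start:start + rows_per_file]
        files.append({"min": min(chunk), "max": max(chunk), "rows": chunk})
    return files


def files_scanned(files, wanted):
    """Count files whose min/max range could contain the wanted value."""
    return sum(1 for f in files if f["min"] <= wanted <= f["max"])


# 100 distinct values in a scrambled order (37 and 100 share no factors,
# so i * 37 % 100 visits every value from 0 to 99 exactly once).
scrambled = [(i * 37) % 100 for i in range(100)]

unclustered = write_files(scrambled, rows_per_file=10)
clustered = write_files(sorted(scrambled), rows_per_file=10)

print("unclustered:", files_scanned(unclustered, 42), "of", len(unclustered))
print("clustered:  ", files_scanned(clustered, 42), "of", len(clustered))
```

When similar values are stored together, each file covers a narrow range, so a filter on that column can rule out most files from their statistics alone.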

The OPTIMIZE command is specifically designed for this purpose in Delta Lake, making it the correct choice for compacting small files to improve performance.
