
Ultimate access to all questions.
A data science team is experiencing slow query performance when searching for specific keywords within a high-cardinality review column (STRING) stored in a Parquet table. A junior data engineer suggests migrating the table to Delta Lake to resolve the performance bottleneck.
How should you respond to this suggestion?
A
Delta Lake statistics are ineffective for high-cardinality free-text fields like reviews, providing limited performance benefits for keyword-based substring searches.
B
Delta Lake tables do not support the STRING data type, which would require the team to restructure the review data into a different format.
C
While Delta Lake offers performance benefits, significant gains for keyword searches would only be achieved by applying Z-ORDER optimization to the review column.
D
The Delta Lake transaction log automatically generates a structured index for text fields, enabling efficient keyword filtering without full table scans.
E
Delta Lake only collects metadata statistics for the first four columns in a table schema, making it ineffective for the review column in this specific table.