
Explanation:
Delta Lake automatically captures statistics in the transaction log for each added data file of the table. By default, it collects statistics on the first 32 columns of each table. These statistics include the total number of records, minimum and maximum values in each column, and null value counts for each column within the first 32 columns. These statistics are utilized for data skipping based on query filters. Reference: Delta Lake Documentation.
Ultimate access to all questions.
No comments yet.
Which of the following statements accurately describes Delta Lake's default behavior regarding file statistics?
A
Delta Lake, by default, captures statistics in the Hive metastore for the first 16 columns of each table.
B
Delta Lake, by default, captures statistics in the transaction log for the first 32 columns of each table.
C
Delta Lake, by default, captures statistics in both the Hive metastore and transaction log for each added data file.
D
Delta Lake, by default, captures statistics in the transaction log for the first 16 columns of each table.
E
Delta Lake, by default, captures statistics in the Hive metastore for the first 32 columns of each table.