
As a data engineer, you manage a Delta Lake table that has accumulated a large number of small files, degrading the performance of your data processing pipeline. Considering cost-effectiveness, compliance with data governance policies, and scalability, which of the following solutions BEST addresses this issue by leveraging Delta Lake's built-in features? Choose the single best option.
A
Adjust the Delta Lake configuration to automatically increase the file size limit, thereby reducing the number of small files without manual intervention.
B
Utilize Delta Lake's OPTIMIZE command to reorganize and merge small files into larger, more efficient files, improving query performance and reducing overhead.
C
Manually delete the small files, then recreate the Delta table with a larger predefined file size to prevent the issue from recurring.
D
Disable Delta Lake's transaction logging feature to minimize the creation of small files, accepting the trade-off of losing transactional integrity and auditability.
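The OPTIMIZE command mentioned in option B is invoked directly in SQL. A minimal sketch, assuming an illustrative table named `events` (the table name and the ZORDER column are placeholders, not from the question):

```sql
-- Compact many small files into fewer, larger files
OPTIMIZE events;

-- Optionally co-locate rows by a frequently filtered column while compacting
OPTIMIZE events ZORDER BY (event_date);

-- Afterwards, VACUUM removes the old, now-unreferenced files
-- (subject to the retention period, 7 days by default)
VACUUM events;
```

Note that OPTIMIZE rewrites data into new files but does not delete the old ones; they remain referenced by the transaction log for time travel until VACUUM reclaims them, which preserves the transactional integrity that option D would sacrifice.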