In a big data processing scenario, you are tasked with optimizing the handling of a large number of small files in a distributed file system. What strategies would you employ to compact these small files and why?
A. Use a distributed file system that inherently compacts small files.
B. Implement a custom script to merge small files into larger ones during the data ingestion phase (a minimal example follows the options).
C. Leverage the file system's built-in compaction feature, if available, to periodically merge small files.
D. Ignore the small files and process them as they are, without any compaction.