
As a Microsoft Fabric Analytics Engineer Associate, you are tasked with optimizing an Azure Data Factory pipeline whose dataflow reads a large number of small files stored in Azure Blob Storage. Current performance is suboptimal because of the overhead of processing so many small files. Considering the constraints of cost, compliance, and scalability, which of the following approaches would BEST improve the performance of the dataflow? Choose one option. (A sketch illustrating the compaction approach follows the choices.)
A. Increase the Data Flow's core count and memory allocation to enhance processing power, despite the potential increase in cost.
B. Combine the small files into larger files before processing, to reduce the number of file reads and minimize overhead.
C. Convert the files to a more efficient format like Parquet, without addressing the small file issue directly.
D. Implement a custom activity to pre-filter the data in the files before the dataflow processes them, adding complexity to the pipeline.
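The compaction approach in option B can be sketched with a short Spark job (for example, in a Fabric notebook) that reads the many small files and rewrites them as a handful of larger Parquet files for the dataflow to consume. This is a minimal illustration, not the only way to compact files; the container names, storage-account placeholder, paths, and partition count below are hypothetical and would need to match your own storage layout and data volume.

```python
# Minimal sketch of option B: compact many small files into a few larger
# Parquet files before the dataflow reads them. All paths and the target
# file count are hypothetical placeholders.
from pyspark.sql import SparkSession

spark = SparkSession.builder.appName("small-file-compaction").getOrCreate()

# Hypothetical source: thousands of small CSV files in one storage folder.
# The abfss:// form assumes an ADLS Gen2 (hierarchical namespace) endpoint;
# adjust the URI scheme for your actual storage configuration.
source_path = "abfss://raw@<storage-account>.dfs.core.windows.net/events/*.csv"

# Hypothetical sink: a compacted, columnar copy the dataflow reads instead.
sink_path = "abfss://curated@<storage-account>.dfs.core.windows.net/events_compacted/"

df = spark.read.option("header", "true").csv(source_path)

# coalesce() reduces the number of output files; choose a count that yields
# files of roughly a few hundred MB each rather than thousands of tiny objects.
(df.coalesce(8)
   .write
   .mode("overwrite")
   .parquet(sink_path))
```

Note that this sketch also converts to Parquet, but the performance gain the question targets comes primarily from reducing the number of file reads, which is why option B, rather than option C alone, addresses the small-file overhead directly.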