
Answer-first summary for fast verification
Answer: To act as the initial storage layer for raw, unprocessed data collected from various sources, ensuring scalability and cost-efficiency.
Option B is correct: the 'source' component of a DLT pipeline is the landing layer where raw, unprocessed data is stored as it is ingested from various sources. This matches the company's cost-effectiveness and scalability requirements, because raw-data storage services such as Azure Data Lake Storage Gen2 are optimized for both. Option A is incorrect because processed data is typically stored in the 'sink' or 'target' component, not the source. Option C is incorrect because transformation and processing belong to the pipeline's 'processing' component. Option D is incorrect because visualization and exploration are end-user activities that take place after the data has been processed, not at the source.
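The division of roles described above (source, processing, sink) can be illustrated with a minimal sketch. It is not an Azure implementation: local directories stand in for a storage service such as Azure Data Lake Storage Gen2, and the function names (`land_raw`, `process`, `write_sink`, `run_pipeline`), the file names, and the JSON Lines input format are all hypothetical choices made for illustration.

```python
import json
import tempfile
from pathlib import Path


def land_raw(source_dir: Path, name: str, payload: bytes) -> Path:
    """Source stage: store the payload exactly as received, with no parsing."""
    source_dir.mkdir(parents=True, exist_ok=True)
    path = source_dir / name
    path.write_bytes(payload)
    return path


def process(raw_path: Path) -> list[dict]:
    """Processing stage: parse raw JSON Lines into structured records."""
    records = []
    for line in raw_path.read_text().splitlines():
        if line.strip():  # skip blank lines
            records.append(json.loads(line))
    return records


def write_sink(sink_dir: Path, name: str, records: list[dict]) -> Path:
    """Sink stage: persist processed records for querying and reporting."""
    sink_dir.mkdir(parents=True, exist_ok=True)
    path = sink_dir / name
    path.write_text(json.dumps(records))
    return path


def run_pipeline(root: Path, payload: bytes) -> tuple[Path, Path]:
    """Run the three stages in order and return the raw and sink file paths."""
    raw_path = land_raw(root / "source", "events.jsonl", payload)
    sink_path = write_sink(root / "sink", "events.json", process(raw_path))
    return raw_path, sink_path


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        raw, sink = run_pipeline(Path(tmp), b'{"id": 1}\n{"id": 2}\n')
        print(raw.name, sink.name)
```

In this sketch the source stage writes bytes unchanged, so the raw copy remains available for reprocessing, while parsing happens only in the processing stage and the structured result is written only to the sink.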
Author: LeetQuiz Editorial Team
In the context of designing a Data Lake and Analytics (DLT) pipeline on Microsoft Azure, consider a scenario where a company needs to ingest large volumes of raw data from various sources for future processing and analysis. The company emphasizes cost-effectiveness, scalability, and compliance with data governance policies. Given these requirements, what is the primary purpose of the 'source' component in the DLT pipeline? Choose the best option from the following:
A. To serve as the final storage for processed and analyzed data, enabling direct querying and reporting.
B. To act as the initial storage layer for raw, unprocessed data collected from various sources, ensuring scalability and cost-efficiency.
C. To transform and process the raw data into a structured format suitable for analytics and visualization tools.
D. To provide a user interface for data visualization and exploration, facilitating insights generation.