
Answer-first summary for fast verification
Answer: Design an ETL pipeline that ingests data in its native format, performs minimal transformations, and stores it in a data lake for further processing and analysis.
Option B is correct. Ingesting data in its native format with only minimal transformation keeps the pipeline flexible and scalable across diverse data types, and storing the raw data in a data lake leaves it available for whatever processing and analysis is needed later. Option A does not fit because a traditional data warehouse stores data in a predefined structured format, which suits diverse data poorly. Option C does not fit because extensive cleansing and transformation before ingestion works against a data lake's purpose of retaining data in its original form. Option D does not fit because ignoring data governance and security invites data quality and compliance problems.
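As a rough illustration of what "native format, minimal transformation" in Option B can look like, here is a minimal sketch of a raw-zone ingest step. It assumes a local directory stands in for the data lake; the function name `ingest_raw`, the `raw/<source_system>/ingest_date=...` layout, and the `.meta.json` sidecar file are hypothetical choices for this example, not part of the question or of any particular product.

```python
import hashlib
import json
import shutil
from datetime import datetime, timezone
from pathlib import Path


def ingest_raw(source_file: Path, lake_root: Path, source_system: str) -> Path:
    """Copy one file into the raw zone unchanged and record sidecar metadata."""
    now = datetime.now(timezone.utc)
    # Hypothetical layout: <lake_root>/raw/<source_system>/ingest_date=YYYY-MM-DD/
    target_dir = lake_root / "raw" / source_system / f"ingest_date={now:%Y-%m-%d}"
    target_dir.mkdir(parents=True, exist_ok=True)

    # The payload is copied byte for byte: no parsing, cleansing, or reshaping.
    target_file = target_dir / source_file.name
    shutil.copy2(source_file, target_file)

    # The only "transformation" is descriptive metadata kept next to the file,
    # which gives later governance and lineage checks something to work with.
    metadata = {
        "source_system": source_system,
        "original_name": source_file.name,
        "size_bytes": target_file.stat().st_size,
        "sha256": hashlib.sha256(target_file.read_bytes()).hexdigest(),
        "ingested_at": now.isoformat(),
    }
    sidecar = target_dir / (source_file.name + ".meta.json")
    sidecar.write_text(json.dumps(metadata, indent=2))
    return target_file
```

Calling `ingest_raw(Path("events.json"), Path("/data/lake"), "crm")` would place an unmodified copy of the file under the raw zone together with its metadata. Cleansing and reshaping are deferred to later stages that read from the raw zone, which is the trade-off Option B describes.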
Author: LeetQuiz Editorial Team
Your company is planning to implement a data lake architecture to store and process large volumes of diverse data. Describe the steps you would take to design and implement an ETL pipeline for a data lake, and explain the considerations involved in handling the data in its native format.
A
Use a traditional data warehouse architecture and store the data in a structured format, as it is more suitable for handling large volumes of diverse data.
B
Design an ETL pipeline that ingests data in its native format, performs minimal transformations, and stores it in a data lake for further processing and analysis.
C
Perform extensive data cleansing and transformation before ingesting the data into the data lake, to ensure data quality and consistency.
D
Use a single-stage ETL process to load all data into the data lake and perform all transformations and analysis there, without considering data governance and security.