
Answer-first summary for fast verification
Answer: C. Design a single bronze table with a unified schema capable of accommodating all streaming data sources and formats, leveraging Delta Lake's capabilities for efficient data processing and consistency.
The BEST approach to avoid common pitfalls when productionalizing streaming workloads is to design a single bronze table with a unified schema that can accommodate all streaming data sources and formats. This approach relies on Delta Lake for efficient ingestion and processing, and for data consistency and integrity. It simplifies the architecture because there are no separate per-source tables or orchestration layers to manage, which also helps with cost and scalability. Key steps include identifying the data sources and formats, defining a unified schema, implementing ingestion pipelines, applying the necessary transformations, and using Delta Lake's write operations for data persistence.
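The "define a unified schema" step can be illustrated with a minimal plain-Python sketch. It is not Spark or Delta Lake code, and the names BronzeRecord, to_bronze, and the four fields are hypothetical, chosen only for illustration. The idea it shows is that each record, whatever its source or format, is wrapped in the same envelope and its payload is stored unparsed, so that parsing can happen in a later transformation step.

```python
from dataclasses import asdict, dataclass
from datetime import datetime, timezone
from typing import Optional


@dataclass(frozen=True)
class BronzeRecord:
    """One row of a hypothetical unified bronze schema."""
    source: str        # which feed the record came from
    data_format: str   # e.g. "json" or "csv"; parsing is deferred downstream
    payload: str       # raw record, stored unparsed
    ingested_at: str   # ISO-8601 UTC ingestion time


def to_bronze(source: str, data_format: str, payload: str,
              now: Optional[datetime] = None) -> BronzeRecord:
    """Wrap a raw record from any source in the shared envelope."""
    if not source or not data_format:
        raise ValueError("source and data_format are required")
    ts = now or datetime.now(timezone.utc)
    return BronzeRecord(source, data_format, payload, ts.isoformat())


# Two sources with different formats land in the same shape.
rows = [
    to_bronze("orders", "json", '{"order_id": 1, "total": 9.5}'),
    to_bronze("sensors", "csv", "42,17.5,ok"),
]
for row in rows:
    print(asdict(row))
```

In an actual pipeline the same envelope would presumably be expressed as the schema of the bronze Delta table, with the source column available for filtering or partitioning when downstream tables are built.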
Author: LeetQuiz Editorial Team
In the context of designing a multiplex bronze table for productionalizing streaming workloads on Azure Databricks, consider the following scenario: Your organization is ingesting streaming data from multiple sources, each with different formats and schemas. The goal is to ingest and process the data efficiently and to maintain data consistency and integrity, while adhering to cost constraints and scalability requirements. Which of the following approaches is the BEST to avoid common pitfalls in this scenario? (Choose one option)
A
Ingest all streaming data into a single bronze table without considering the source or data format, relying on post-ingestion transformations to handle discrepancies.
B
Create separate bronze tables for each streaming data source, each with a unique schema, and manage them independently to ensure data isolation.
C
Design a single bronze table with a unified schema capable of accommodating all streaming data sources and formats, leveraging Delta Lake's capabilities for efficient data processing and consistency.
D
Implement multiple bronze tables, each tailored to a specific data source or format, and use a complex orchestration layer to manage data flow and transformations between them.