
Answer-first summary for fast verification
Answer: Implement a star schema design with dimension tables for time, geography, product, and customer demographics, centering around a sales fact table.
A star schema suits multi-dimensional analytics because it places a central fact table (sales) at the center of a set of dimension tables (time, geography, product, customer demographics). Every dimension is a single join away from the fact table, so queries avoid the multi-level join chains that a fully normalized or snowflake design requires, and the query optimizer has a simple, predictable join pattern to work with. Compared with a single flattened wide table, it keeps descriptive attributes in one place instead of repeating them on every sales row, which makes updates and governance easier. The design also scales well: a new dimension can be added as a new table plus a key column on the fact table, leaving existing queries unchanged. The model is easy to understand and maintain, which helps preserve data integrity. For these reasons, the star schema is the best of the listed options for optimizing query performance across many dimensions in a Databricks lakehouse.
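The layout described above can be sketched in a few lines. This is a minimal illustration, not a Databricks implementation: it assumes an in-memory SQLite database as a stand-in for lakehouse tables, and every table name, column name, and data row (`fact_sales`, `dim_time`, and so on) is hypothetical. It shows that a question spanning two dimensions needs only one join per dimension.

```python
import sqlite3

# Assumption: in-memory SQLite stands in for the lakehouse; in Databricks these
# would typically be Delta tables. All names and rows below are hypothetical.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE dim_time      (time_key INTEGER PRIMARY KEY, year INTEGER, quarter TEXT);
CREATE TABLE dim_geography (geo_key INTEGER PRIMARY KEY, country TEXT, region TEXT);
CREATE TABLE dim_product   (product_key INTEGER PRIMARY KEY, category TEXT, name TEXT);
CREATE TABLE dim_customer  (customer_key INTEGER PRIMARY KEY, age_band TEXT, segment TEXT);

-- The fact table holds only foreign keys and numeric measures.
CREATE TABLE fact_sales (
    time_key     INTEGER REFERENCES dim_time(time_key),
    geo_key      INTEGER REFERENCES dim_geography(geo_key),
    product_key  INTEGER REFERENCES dim_product(product_key),
    customer_key INTEGER REFERENCES dim_customer(customer_key),
    quantity     INTEGER,
    amount       REAL
);

INSERT INTO dim_time      VALUES (1, 2024, 'Q1'), (2, 2024, 'Q2');
INSERT INTO dim_geography VALUES (1, 'US', 'West'), (2, 'DE', 'Central');
INSERT INTO dim_product   VALUES (1, 'Electronics', 'Phone'), (2, 'Grocery', 'Coffee');
INSERT INTO dim_customer  VALUES (1, '18-34', 'Loyalty'), (2, '35-54', 'Guest');
INSERT INTO fact_sales VALUES
    (1, 1, 1, 1, 2, 1000.0),
    (1, 2, 2, 2, 5,   50.0),
    (2, 1, 1, 2, 1,  500.0),
    (2, 1, 2, 1, 3,   30.0);
""")


def revenue_by_quarter_and_category(connection):
    """Aggregate sales across two dimensions with one join per dimension."""
    query = """
        SELECT t.quarter, p.category, SUM(f.amount) AS revenue
        FROM fact_sales AS f
        JOIN dim_time    AS t ON t.time_key    = f.time_key
        JOIN dim_product AS p ON p.product_key = f.product_key
        GROUP BY t.quarter, p.category
        ORDER BY t.quarter, p.category
    """
    return connection.execute(query).fetchall()


rows = revenue_by_quarter_and_category(conn)
for quarter, category, revenue in rows:
    print(quarter, category, revenue)
```

Slicing by geography or customer demographics instead follows the same pattern: swap in `dim_geography` or `dim_customer` and join on its key, without touching the fact table's structure.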
Author: LeetQuiz Editorial Team
You are designing a data model in a Databricks lakehouse to support multi-dimensional analytics for a retail company. How would you structure the data to optimize for query performance across numerous dimensions such as time, geography, product, and customer demographics?
A. Store all data in a single, flattened wide table to minimize join operations during query execution.
B. Normalize the data into multiple related tables to reduce redundancy and save on storage costs.
C. Implement a star schema design with dimension tables for time, geography, product, and customer demographics, centering around a sales fact table.
D. Use a snowflake schema but avoid creating dimension tables for highly hierarchical dimensions like geography.