
Answer-first summary for fast verification
Answer: C. Utilizing a star schema with dimension tables for user attributes and a fact table for behavior events
In a lakehouse architecture, a star schema is the best of the four options for query performance and scalability when user behavior data spans many dimensions. A central fact table stores the behavior events, and each row carries keys that point to the surrounding dimension tables of user attributes. Joins stay shallow: a query joins the fact table directly to only the dimensions it needs, one hop each, instead of walking a chain of normalized tables. A new dimension can be added as a new table plus a key on the fact table, so existing dimension tables and existing queries are left unchanged, which is what makes the model scale. The alternatives fall short in different ways. A single wide table avoids joins, but with hundreds of dimension columns it becomes hard to maintain and extend. A snowflake schema normalizes the dimensions further, which adds joins and complexity to every query. Pre-aggregated tables speed up the specific queries they were built for, but they lack the flexibility for ad hoc analysis across all dimensions.
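The layout described above can be sketched in a few lines. This is a minimal illustration rather than a lakehouse implementation: it uses Python's built-in `sqlite3` module as a stand-in for a lakehouse query engine, and the table and column names (`fact_event`, `dim_user`, `dim_device`, `dim_campaign`) and the sample rows are all hypothetical.

```python
import sqlite3


def build_star_schema(conn):
    """Create one fact table and two dimension tables (hypothetical names)."""
    conn.executescript("""
        CREATE TABLE dim_user (
            user_key INTEGER PRIMARY KEY,
            country  TEXT,
            plan     TEXT
        );
        CREATE TABLE dim_device (
            device_key  INTEGER PRIMARY KEY,
            device_type TEXT
        );
        -- The fact table holds one row per event plus a key per dimension.
        CREATE TABLE fact_event (
            event_id    INTEGER PRIMARY KEY,
            user_key    INTEGER REFERENCES dim_user(user_key),
            device_key  INTEGER REFERENCES dim_device(device_key),
            event_type  TEXT,
            event_count INTEGER
        );
    """)
    conn.executemany("INSERT INTO dim_user VALUES (?, ?, ?)",
                     [(1, "US", "free"), (2, "DE", "pro")])
    conn.executemany("INSERT INTO dim_device VALUES (?, ?)",
                     [(1, "mobile"), (2, "desktop")])
    conn.executemany("INSERT INTO fact_event VALUES (?, ?, ?, ?, ?)",
                     [(1, 1, 1, "click", 3),
                      (2, 1, 2, "view", 5),
                      (3, 2, 2, "click", 2),
                      (4, 2, 1, "click", 4)])


def clicks_by_plan_and_device(conn):
    """Join the fact table to each needed dimension directly (one hop each)."""
    return conn.execute("""
        SELECT u.plan, d.device_type, SUM(f.event_count)
        FROM fact_event AS f
        JOIN dim_user   AS u ON u.user_key   = f.user_key
        JOIN dim_device AS d ON d.device_key = f.device_key
        WHERE f.event_type = 'click'
        GROUP BY u.plan, d.device_type
        ORDER BY u.plan, d.device_type
    """).fetchall()


def add_campaign_dimension(conn):
    """Add a new dimension without altering the existing dimension tables."""
    conn.execute("""
        CREATE TABLE dim_campaign (
            campaign_key INTEGER PRIMARY KEY,
            channel      TEXT
        )
    """)
    # Existing fact rows simply get NULL for the new key.
    conn.execute("ALTER TABLE fact_event ADD COLUMN campaign_key INTEGER")


conn = sqlite3.connect(":memory:")
build_star_schema(conn)
rows = clicks_by_plan_and_device(conn)
print(rows)
```

Calling `add_campaign_dimension(conn)` afterwards leaves the result of `clicks_by_plan_and_device(conn)` unchanged, which illustrates the point about extending the model: the new dimension is one extra table and one extra key column, and queries that do not use it are unaffected.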
Author: LeetQuiz Editorial Team
When analyzing user behavior data with potentially hundreds of dimensions in a lakehouse architecture, which data modeling strategy would best optimize for query performance and scalability?
A. Storing user events in a single wide table with a column for each dimension to minimize join operations
B. Creating a pre-aggregated summary table that updates daily with the most common analytics queries
C. Utilizing a star schema with dimension tables for user attributes and a fact table for behavior events
D. Implementing a snowflake schema where user behavior events are normalized into multiple related tables