
Answer-first summary for fast verification
Answer: 40
## Analysis of Partition Strategy for Azure Synapse Analytics

### Understanding the Architecture

Azure Synapse Analytics dedicated SQL pool uses a Massively Parallel Processing (MPP) architecture that always divides table data into **60 distributions**. When the table is hash-distributed on ProductID, the rows for the 20,000 products are spread across these 60 distributions.

### Key Performance Considerations

For optimal clustered columnstore index (CCI) performance in Azure Synapse Analytics:

- A columnstore rowgroup holds up to 1,048,576 rows, and compression and query performance are best when rowgroups are close to full, so each partition in each distribution should contain approximately **1 million rows**
- The 60 distributions operate independently, so the row count that matters is the count per partition per distribution, not per partition across the whole table
- Too many partitions lead to small rowgroups that reduce compression efficiency
- Too few partitions limit partition elimination benefits

### Calculation Breakdown

1. **Total Records**: 2.4 billion rows
2. **Distributions**: 60 (fixed in a dedicated SQL pool)
3. **Rows per Distribution**: 2.4 billion ÷ 60 = 40 million rows per distribution
4. **Optimal Rows per Partition**: 1 million rows per distribution (for efficient columnstore compression)
5. **Partitions**: 40 million ÷ 1 million = 40 partitions

### Why 40 Partitions is Optimal

- **40 partitions** gives each partition approximately 1 million rows within each distribution
- This aligns with Microsoft's best practice of having at least 1 million rows per distribution and partition for a clustered columnstore table
- It balances partition elimination benefits against compression efficiency
- It avoids the overhead of many small partitions while maintaining query performance

### Analysis of Other Options

- **B: 240 partitions** - Each partition holds 10 million rows across the entire table, but split over 60 distributions that is only ~167,000 rows per partition per distribution, well below the 1 million target
- **C: 400 partitions** - Results in even smaller partitions (~100,000 rows per partition per distribution), further reducing compression efficiency
- **D: 2,400 partitions** - Creates extremely small partitions (~16,667 rows per partition per distribution), which would severely impact columnstore compression and query performance

The key insight is that every partition is itself split across all 60 distributions, so a table with N partitions is stored as 60 × N slices. The optimal number of partitions is therefore the one that leaves approximately 1 million rows in each slice: 2.4 billion ÷ (60 × 40) = 1 million.
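
The arithmetic behind each option can be checked with a short script. This is a minimal sketch, not part of the original explanation: the function names `rows_per_slice` and `max_partitions` are hypothetical, and it assumes the hash distribution on ProductID spreads rows evenly across the 60 distributions, which real data will only approximate.

```python
# Sketch: rows per partition per distribution for each answer option.
# Assumption: rows are spread evenly across distributions and partitions.

DISTRIBUTIONS = 60            # fixed for a dedicated SQL pool
TOTAL_ROWS = 2_400_000_000    # 2.4 billion records, from the question
TARGET_ROWS = 1_000_000       # target rows per partition per distribution


def rows_per_slice(total_rows, partitions, distributions=DISTRIBUTIONS):
    """Average rows in one partition of one distribution."""
    return total_rows / (partitions * distributions)


def max_partitions(total_rows, target_rows=TARGET_ROWS, distributions=DISTRIBUTIONS):
    """Largest partition count that still keeps every slice at the target size."""
    return total_rows // (target_rows * distributions)


if __name__ == "__main__":
    options = {"A": 40, "B": 240, "C": 400, "D": 2400}
    for label, partitions in options.items():
        rows = rows_per_slice(TOTAL_ROWS, partitions)
        verdict = "meets target" if rows >= TARGET_ROWS else "below target"
        print(f"{label}: {partitions:>5} partitions -> {rows:>12,.0f} rows per slice ({verdict})")
    print("Largest partition count meeting the target:", max_partitions(TOTAL_ROWS))
```

Only option A reaches the 1 million row target. Every larger partition count divides the same 40 million rows per distribution into smaller slices.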
Author: LeetQuiz Editorial Team
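You are designing a partition strategy for a fact table in an Azure Synapse Analytics dedicated SQL pool. The table has the following specifications:

- Contains 2.4 billion records
- Holds data for 20,000 products
- Uses hash distribution on the ProductID column
- Uses a clustered columnstore index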
Which number of partition ranges provides optimal compression and performance for the clustered columnstore index?

- A: 40
- B: 240
- C: 400
- D: 2,400