
Explanation:
In an Azure Synapse Analytics dedicated SQL pool, the right partitioning approach for a table with a clustered columnstore index follows from how the pool distributes and compresses data.
Default Distributions: A dedicated SQL pool always divides each table into 60 distributions, before any partitioning is applied. This is a fixed characteristic of its MPP (Massively Parallel Processing) architecture.
Optimal Row Count per Distribution/Partition: For clustered columnstore tables to achieve optimal compression and query performance, Microsoft recommends having at least 1 million rows per distribution and partition.
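Partitioning Strategy: Partitions subdivide the data inside each of the 60 distributions, so the table must be large enough to keep at least 1 million rows in every distribution and partition combination. With a single partition that means 60 × 1 million = 60 million rows, so Table1 should hold at least 60 million rows before partitions are created (option D). Each additional partition raises the requirement by another 60 million rows, assuming rows are spread evenly.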
This guidance aligns with Microsoft's best practices for maintaining optimal columnstore compression and query performance in large-scale data warehousing scenarios.
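The arithmetic behind this guidance can be sketched in a few lines of Python. This is only an illustrative calculation, not part of any Azure tooling: the function names `min_rows_for_partitions` and `max_partitions_for_rows` are hypothetical, and the sketch assumes rows are spread evenly across distributions and partitions, which real data rarely does exactly.

```python
# Illustrative sizing arithmetic for a clustered columnstore table in a
# dedicated SQL pool. Names and the even-spread assumption are hypothetical.

DISTRIBUTIONS = 60                # fixed number of distributions per table
MIN_ROWS_PER_SLICE = 1_000_000    # recommended minimum per distribution and partition


def min_rows_for_partitions(partition_count: int) -> int:
    """Rows needed so each distribution/partition slice holds >= 1 million rows."""
    if partition_count < 1:
        raise ValueError("partition_count must be at least 1")
    return DISTRIBUTIONS * partition_count * MIN_ROWS_PER_SLICE


def max_partitions_for_rows(total_rows: int) -> int:
    """Largest partition count that still keeps >= 1 million rows per slice."""
    if total_rows < 0:
        raise ValueError("total_rows must not be negative")
    return total_rows // (DISTRIBUTIONS * MIN_ROWS_PER_SLICE)


if __name__ == "__main__":
    print(min_rows_for_partitions(1))            # a single partition
    print(min_rows_for_partitions(12))           # e.g. monthly partitions
    print(max_partitions_for_rows(500_000_000))  # partitions a 500M-row table supports
```

Under these assumptions, a table with fewer than 60 million rows supports zero partitions at the recommended density, which is why 60 million is the threshold in the question.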
You have an Azure Synapse Analytics dedicated SQL pool with a fact table named Table1 that will use a clustered columnstore index. To optimize data compression and query performance, what is the minimum number of rows Table1 should have before creating partitions?
A. 100,000
B. 600,000
C. 1 million
D. 60 million