
Answer-first summary for fast verification
Answer: TransactionMonth
## Detailed Explanation ### Understanding Partitioning in Azure Synapse Analytics Partitioning in Azure Synapse Analytics dedicated SQL pools is a technique used to divide large tables into smaller, more manageable pieces called partitions. This improves query performance by allowing the query engine to eliminate entire partitions from scanning when they don't contain relevant data. ### Analysis of Query Requirements The key requirements are: - **Primary filtering**: Analysts will most commonly filter transactions for a specific month - **Secondary analysis**: Summarization by transaction type, customer segment, and/or account type ### Evaluation of Partitioning Options #### **TransactionMonth (Option D) - RECOMMENDED** - **65 million rows per month** provides optimal partition sizing - Directly aligns with the most common query pattern (filtering by month) - Enables partition elimination for monthly queries, significantly reducing data scanned - Each partition contains sufficient data for clustered columnstore index efficiency (well above the 1 million row minimum) - Monthly partitioning naturally organizes data chronologically, which is a common best practice #### **AccountType (Option B)** - **500 million rows per account type** creates overly large partitions - Does not align with the primary query pattern (monthly filtering) - Large partitions reduce the benefits of partition elimination - May lead to inefficient resource utilization #### **TransactionType (Option C)** - **40 million rows per transaction type** is reasonable for partition sizing - However, doesn't directly support the primary monthly filtering requirement - Less effective for the most common query pattern #### **CustomerSegment (Option A)** - **4 million rows per customer segment** creates too many small partitions - Small partitions can lead to inefficient columnstore compression - Does not support the primary monthly query pattern ### Why TransactionMonth is Optimal 1. **Query Pattern Alignment**: Since analysts primarily filter by month, partitioning on TransactionMonth enables efficient partition elimination, where the query engine can skip entire partitions that don't contain the requested month's data. 2. **Optimal Partition Size**: With 65 million rows per month, each partition contains sufficient data for efficient columnstore compression while remaining manageable in size. 3. **Best Practices Compliance**: Microsoft recommends partitioning on date columns when queries commonly filter by date ranges, as this naturally aligns with data loading and query patterns. 4. **Performance Benefits**: Monthly partitioning allows for: - Faster data loading and maintenance operations - Efficient archiving of older data - Optimized query performance for time-based analysis ### Clarification on Parallel Processing Some confusion exists regarding parallel processing and "hot partitions." In Azure Synapse Analytics: - Partitioning and distribution are separate concepts - Each partition is automatically distributed across 60 distributions - Partitioning by month does not create "hot partitions" that prevent parallel processing - The distribution mechanism ensures parallel processing regardless of partitioning strategy ### Conclusion TransactionMonth is the optimal partitioning column because it directly supports the primary query pattern, provides appropriate partition sizing, and follows Azure Synapse Analytics best practices for time-based data analysis scenarios.
Ultimate access to all questions.
No comments yet.
Author: LeetQuiz Editorial Team
You are designing a financial transactions table with a clustered columnstore index in an Azure Synapse Analytics dedicated SQL pool. The table has the following columns and estimated rows per value:
The query requirements are:
You need to recommend a partitioning column for the table to minimize query times. Which column should you choose?

A
CustomerSegment
B
AccountType
C
TransactionType
D
TransactionMonth