
Answer-first summary for fast verification
Answer: once per year
## Detailed Analysis ### Understanding the Scenario - **Table1** has 1 billion existing rows with a clustered columnstore index - The table is hash-distributed on Product Key across 60 distributions (default in Azure Synapse Analytics) - 30 million new rows are added monthly - Partitioning is required on Sales Date column to optimize query performance and data loading ### Key Considerations for Partitioning in Azure Synapse Analytics #### 1. Minimum Row Requirements for Clustered Columnstore Index - Microsoft recommends **at least 1 million rows per distribution and partition** for optimal compression and performance - With 60 distributions, this translates to **60 million rows per partition** for optimal performance #### 2. Data Distribution Analysis - **Monthly partitioning (Option A)**: 30 million rows ÷ 60 distributions = 500,000 rows per distribution - This falls **below** the recommended 1 million rows per distribution - Results in poor columnstore compression and suboptimal query performance - **Yearly partitioning (Option B)**: 360 million rows ÷ 60 distributions = 6 million rows per distribution - **Exceeds** the minimum requirement significantly - Provides excellent compression and query performance - **Daily partitioning (Option C)**: ~1 million rows ÷ 60 distributions = ~16,667 rows per distribution - Far below minimum requirements - Would create excessive partitions with poor performance - **Weekly partitioning (Option D)**: ~7.5 million rows ÷ 60 distributions = ~125,000 rows per distribution - Still below the 1 million row threshold #### 3. Query Performance Considerations - Yearly partitions (Option B) provide: - Better columnstore compression ratios - More efficient segment elimination during queries - Reduced metadata overhead - Faster query execution for time-based filtering #### 4. Data Loading Optimization - While monthly partitioning might seem intuitive for monthly data loads, the performance penalty from insufficient rows per distribution outweighs any loading benefits - Azure Synapse Analytics can efficiently handle large partition loads - The existing 1 billion rows already provide a solid foundation for yearly partitioning ### Why Option B is Optimal - **Meets minimum row requirements**: 6 million rows per distribution exceeds the 1 million threshold - **Optimal compression**: Larger row counts per partition enable better columnstore compression - **Query performance**: Fewer, larger partitions reduce metadata overhead and improve segment elimination - **Future scalability**: Accommodates growth while maintaining performance standards ### Why Other Options Are Less Suitable - **Option A (Monthly)**: Insufficient rows per distribution (500K < 1M minimum) - **Option C (Daily)**: Extremely granular partitioning with very poor performance - **Option D (Weekly)**: Still below minimum row requirements (125K < 1M) ### Best Practice Alignment Microsoft's documentation clearly states that for clustered columnstore tables in dedicated SQL pools, partitions should contain sufficient data to achieve at least 1 million rows per distribution. Yearly partitioning with 360 million total rows (6 million per distribution) aligns perfectly with this guidance while optimizing both query performance and data loading efficiency.
Ultimate access to all questions.
Author: LeetQuiz Editorial Team
You have an Azure Synapse Analytics dedicated SQL pool with a table named Table1. The table has the following properties:
Thirty million new rows are added to Table1 monthly. You plan to partition Table1 on the Sales Date column to optimize both query performance and data loading.
How often should you create a partition?
A
once per month
B
once per year
C
once per day
D
once per week
No comments yet.