
Explanation:
To identify data skew in Table1 within Pool1, the correct approach is to connect to Pool1 and run DBCC PDW_SHOWSPACEUSED.
DBCC PDW_SHOWSPACEUSED is specifically designed for Azure Synapse Analytics dedicated SQL pools (formerly SQL Data Warehouse) to analyze how data is spread across the 60 distributions that make up the MPP (Massively Parallel Processing) architecture. For a given table, it returns one row per distribution showing the row count and the reserved, data, index, and unused space, along with the node and distribution IDs.
When executed against Table1, this command reveals which distributions contain disproportionately large amounts of data, directly indicating the extent of data skew.
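As a minimal sketch, the command for this scenario could look like the following. The dbo schema is an assumption; the question does not say which schema Table1 belongs to.

```sql
-- Run while connected to the dedicated SQL pool (Pool1), not the built-in pool.
-- Assumption: Table1 lives in the dbo schema; adjust the name if it does not.
DBCC PDW_SHOWSPACEUSED('dbo.Table1');
```

Comparing the ROWS values across the returned rows shows the skew: roughly equal counts mean the table is evenly distributed, while a few distributions with much higher counts indicate skew.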
Option B (Connect to the built-in pool and run DBCC PDW_SHOWSPACEUSED): This is incorrect because the built-in pool is the workspace's serverless SQL pool, which is separate from Pool1. It does not store Table1 and cannot run DBCC PDW_SHOWSPACEUSED against dedicated SQL pool objects.
Option C (Connect to Pool1 and run DBCC CHECKALLOC): This is incorrect because DBCC CHECKALLOC checks the consistency of disk space allocation structures in a traditional SQL Server database. It reports nothing about how rows are spread across distributions, so it cannot reveal data skew.
Option D (Connect to the built-in pool and query sys.dm_pdw_sys_info): This is incorrect because sys.dm_pdw_sys_info reports system-wide activity counters for a dedicated SQL pool, such as session and request counts, not data distribution metrics for specific tables. It doesn't offer the per-distribution detail needed to identify data skew in an individual table, and the built-in pool is the wrong place to connect in any case.
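Data skew is a critical performance consideration in Azure Synapse Analytics because a query completes only when its slowest distribution completes. Distributions that hold more rows do more work, so uneven data distribution can lead to slower queries and uneven use of compute resources.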
Running DBCC PDW_SHOWSPACEUSED directly against the target dedicated SQL pool provides the most accurate and actionable information for identifying and quantifying data skew issues.
You have an Azure Synapse Analytics dedicated SQL pool named Pool1 that contains a fact table named Table1. You need to determine the extent of the data skew in Table1.
What should you run in Synapse Studio?
A. Connect to Pool1 and run DBCC PDW_SHOWSPACEUSED.
B. Connect to the built-in pool and run DBCC PDW_SHOWSPACEUSED.
C. Connect to Pool1 and run DBCC CHECKALLOC.
D. Connect to the built-in pool and query sys.dm_pdw_sys_info.