You are designing a fact table named FactPurchase in an Azure Synapse Analytics dedicated SQL pool for a retail store. The table will contain purchase data from suppliers with the following columns:
PurchaseKey
SupplierKey
StockItemKey
DateKey
IsOrderFinalized
The table will have 1 million rows added daily and will store three years of data. Daily queries will be executed that are similar to the following:
SELECT
    SupplierKey,
    StockItemKey,
    IsOrderFinalized,
    COUNT(*)
FROM FactPurchase
WHERE DateKey >= 20210101
    AND DateKey <= 20210131
GROUP BY
    SupplierKey,
    StockItemKey,
    IsOrderFinalized
Which table distribution type will minimize query times?
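
To make the options concrete, here is a minimal sketch of how the table might be declared. At 1 million rows per day over three years, the table holds roughly 1,000,000 x 365 x 3 = 1,095,000,000 rows, or about 18 million rows in each of the 60 distributions a dedicated SQL pool always uses. For a fact table of that size, the usual guidance is hash distribution on a column with many distinct values that is not a date column, because the daily query filters on a DateKey range and hashing on DateKey would concentrate one month of data on a few distributions. The sketch assumes PurchaseKey is that column. The dbo schema and all data types are assumptions, since the question gives only column names.

```sql
-- Sketch only: the schema name and data types are assumptions;
-- the question lists column names without types.
CREATE TABLE dbo.FactPurchase
(
    PurchaseKey      BIGINT NOT NULL,  -- assumed type; high-cardinality key
    SupplierKey      INT    NOT NULL,  -- assumed type
    StockItemKey     INT    NOT NULL,  -- assumed type
    DateKey          INT    NOT NULL,  -- yyyymmdd integer, as in the WHERE clause
    IsOrderFinalized BIT    NOT NULL   -- assumed type
)
WITH
(
    -- Spread rows evenly across the 60 distributions by hashing a
    -- high-cardinality column that the query does not filter on.
    DISTRIBUTION = HASH(PurchaseKey),
    CLUSTERED COLUMNSTORE INDEX
);
```

The other distribution types use the same WITH clause: DISTRIBUTION = ROUND_ROBIN is typically used for staging tables, and DISTRIBUTION = REPLICATE for small dimension tables, so neither is a natural fit for a fact table of about a billion rows.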