
Answer-first summary for fast verification
Answer: Change the distribution key to the table column that has the largest dimension.
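In Amazon Redshift, one compute node running at much higher CPU utilization than the others usually indicates data skew: rows are distributed unevenly, so one node does most of the work. The distribution key determines which node stores each row. Changing the distribution key to a high-cardinality column (the "largest dimension", meaning the column with the most distinct values) helps spread rows evenly across all five compute nodes and balances the CPU load without changing the node count. Changing the sort key or the primary key does not change how rows are distributed across nodes, and upgrading the node type adds capacity without correcting the skew.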
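The effect of key cardinality on distribution can be illustrated with a small simulation. This is a toy sketch, not Redshift's actual behavior: the SHA-256 based `node_for` function is a stand-in for Redshift's internal hash, and the table contents, key names, and helper functions (`rows_per_node`, `skew_ratio`) are hypothetical.

```python
import hashlib
from collections import Counter

NUM_NODES = 5  # matches the five compute nodes in the question


def node_for(value: str, num_nodes: int = NUM_NODES) -> int:
    """Map a distribution-key value to a node (toy stand-in for Redshift's hash)."""
    digest = hashlib.sha256(value.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % num_nodes


def rows_per_node(keys, num_nodes: int = NUM_NODES):
    """Count how many rows land on each node for a list of key values."""
    counts = Counter(node_for(k, num_nodes) for k in keys)
    return [counts.get(n, 0) for n in range(num_nodes)]


def skew_ratio(per_node):
    """Busiest node's row count divided by the mean; 1.0 is perfectly even."""
    mean = sum(per_node) / len(per_node)
    return max(per_node) / mean


# Hypothetical table of 100,000 rows.
# Low-cardinality key: 90% of rows share one value, so they all hash to one node.
low_card_keys = ["region-A"] * 90_000 + [f"region-{i % 3}" for i in range(10_000)]
# High-cardinality key: every row has a distinct value (for example, an order ID).
high_card_keys = [f"order-{i}" for i in range(100_000)]

low = rows_per_node(low_card_keys)
high = rows_per_node(high_card_keys)

print("low-cardinality key :", low, "skew ratio", round(skew_ratio(low), 2))
print("high-cardinality key:", high, "skew ratio", round(skew_ratio(high), 2))
```

With the low-cardinality key, at least 90% of the rows land on a single node, which mirrors the one-hot-node pattern in the question. With the high-cardinality key, the row counts per node come out close to equal.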
Author: Ritesh Yadav
Question 40
A data engineer notices that one of the nodes frequently has a CPU load over 90%. SQL queries that run on the node are queued. The other four nodes usually have a CPU load under 15% during daily operations. The data engineer wants to maintain the current number of compute nodes. The data engineer also wants to balance the load more evenly across all five compute nodes. Which solution will meet these requirements?
A. Change the sort key to be the data column that is most often used in a WHERE clause of the SQL SELECT statement.
B. Change the distribution key to the table column that has the largest dimension.
C. Upgrade the reserved node from ra3.4xlarge to ra3.16xlarge.
D. Change the primary key to be the data column that is most often used in a WHERE clause of the SQL SELECT statement.