Topic 1 Question 55
A company uses an Amazon Redshift provisioned cluster as its database. The Redshift cluster has five reserved ra3.4xlarge nodes and uses key distribution. A data engineer notices that one of the nodes frequently has a CPU load over 90%. SQL queries that run on the node are queued. The other four nodes usually have a CPU load under 15% during daily operations. The data engineer wants to maintain the current number of compute nodes. The data engineer also wants to balance the load more evenly across all five compute nodes. Which solution will meet these requirements?
A. Change the sort key to be the data column that is most often used in a WHERE clause of the SQL SELECT statement.
B. Change the distribution key to the table column that has the largest dimension.
C. Upgrade the reserved node from ra3.4xlarge to ra3.16xlarge.
D. Change the primary key to be the data column that is most often used in a WHERE clause of the SQL SELECT statement.
Comments (5)
- Selected answer: B
https://docs.aws.amazon.com/redshift/latest/dg/t_Distributing_data.html Option B, changing the distribution key, is the most effective solution to balance the load more evenly across all five compute nodes. Selecting an appropriate distribution key that aligns with the query patterns and data characteristics can result in a more uniform distribution of data and workloads, thus reducing the likelihood of one node being overutilized while others are underutilized.
👍 7 rralucard_ 2024/08/02
B. With key distribution, the rows are distributed according to the values in one column. The leader node places matching values on the same node slice. If you distribute a pair of tables on the joining keys, the leader node collocates the rows on the slices according to the values in the joining columns. This way, matching values from the common columns are physically stored together. https://docs.aws.amazon.com/redshift/latest/dg/c_choosing_dist_sort.html
👍 2 damaldon 2024/09/06
- Selected answer: B
The correct solution is B. Change the distribution key to the table column that has the largest dimension. This will help to distribute the data more evenly across the nodes, reducing the load on the heavily utilized node.
👍 2 khchan123 2024/10/28
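To make the skew argument in the comments concrete, here is a rough Python sketch of how key distribution can overload a single node. It is only a simulation under stated assumptions: the MD5-based `node_for` function stands in for Redshift's internal hash (which it is not), rows are mapped straight to nodes rather than slices, and the table, the `country`-style key, and the `order_id`-style key are all made-up illustrations.

```python
import hashlib
from collections import Counter

NUM_NODES = 5  # same node count as the cluster in the question


def node_for(value, num_nodes=NUM_NODES):
    """Map a distribution-key value to a node.

    Assumption: MD5 is only a stand-in for a stable hash; it is not
    the function Redshift uses internally.
    """
    digest = hashlib.md5(str(value).encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % num_nodes


def rows_per_node(key_values, num_nodes=NUM_NODES):
    """Count how many rows land on each node for a given key column."""
    counts = Counter(node_for(v, num_nodes) for v in key_values)
    return [counts.get(n, 0) for n in range(num_nodes)]


def skew_ratio(per_node):
    """Ratio of the fullest node to the emptiest node (inf if a node is empty)."""
    smallest = min(per_node)
    return float("inf") if smallest == 0 else max(per_node) / smallest


# Hypothetical 100,000-row table.
# Skewed key: 90% of rows share one value, so they all hash to one node.
skewed_key = ["US"] * 90_000 + [f"C{i % 20}" for i in range(10_000)]
# High-cardinality key: every row has a distinct value.
even_key = list(range(100_000))

skewed = rows_per_node(skewed_key)
even = rows_per_node(even_key)

print("skewed key rows per node:", skewed)
print("even key rows per node:  ", even)
```

With the skewed key, at least 90,000 of the 100,000 rows end up on one node, which mirrors the one-hot-node symptom in the question. With the high-cardinality key, the rows spread roughly evenly. On a real cluster, the equivalent check would be to look at the table's row skew and then change the distribution key, rather than adding or resizing nodes.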