Topic 1 Question 133
A company is using Amazon Redshift to build a data warehouse solution. The company is loading hundreds of files into a fact table that is in a Redshift cluster.
The company wants the data warehouse solution to achieve the greatest possible throughput. The solution must use cluster resources optimally when the company loads data into the fact table.
Which solution will meet these requirements?
A. Use multiple COPY commands to load the data into the Redshift cluster.
B. Use S3DistCp to load multiple files into Hadoop Distributed File System (HDFS). Use an HDFS connector to ingest the data into the Redshift cluster.
C. Use a number of INSERT statements equal to the number of Redshift cluster nodes. Load the data in parallel into each node.
D. Use a single COPY command to load the data into the Redshift cluster.
User votes
Comments (5)
👍 5 | canace | 2024/08/04
Selected answer: D
A single COPY command automatically parallelizes the load operation across all nodes in the Redshift cluster. This ensures optimal use of cluster resources.
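A minimal sketch of what option D looks like in practice, assuming the input files sit under a single hypothetical S3 prefix and the cluster has an attached IAM role; the table name, bucket, role ARN, and region below are placeholders, not values from the question.

```sql
-- Single COPY against a key prefix: Redshift distributes the matching files
-- across all node slices and loads them in parallel, so one statement uses
-- the whole cluster instead of serializing many smaller loads.
COPY sales_fact                                      -- hypothetical fact table
FROM 's3://example-bucket/fact-files/'               -- prefix covering all input files
IAM_ROLE 'arn:aws:iam::123456789012:role/RedshiftCopyRole'  -- placeholder role ARN
FORMAT AS CSV                                        -- adjust to the actual file format
REGION 'us-east-1';                                  -- assumed bucket region
```

Keeping the files similarly sized, with a count that is a multiple of the number of slices, keeps every slice busy during the load, which is the AWS-recommended way to maximize COPY throughput.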
👍 3 | antun3ra | 2024/08/07
Selected answer: D
Agree with canace; Redshift's COPY command uses the MPP architecture to read and load files in parallel into the data warehouse.
👍 2 | Shanmahi | 2024/08/05