Topic 1 Question 140
A company receives test results from testing facilities that are located around the world. The company stores the test results in millions of 1 KB JSON files in an Amazon S3 bucket. A data engineer needs to process the files, convert them into Apache Parquet format, and load them into Amazon Redshift tables. The data engineer uses AWS Glue to process the files, AWS Step Functions to orchestrate the processes, and Amazon EventBridge to schedule jobs.
The company recently added more testing facilities. The time required to process files is increasing. The data engineer must reduce the data processing time.
Which solution will MOST reduce the data processing time?
A. Use AWS Lambda to group the raw input files into larger files. Write the larger files back to Amazon S3. Use AWS Glue to process the files. Load the files into the Amazon Redshift tables.
B. Use the AWS Glue dynamic frame file-grouping option to ingest the raw input files. Process the files. Load the files into the Amazon Redshift tables.
C. Use the Amazon Redshift COPY command to move the raw input files from Amazon S3 directly into the Amazon Redshift tables. Process the files in Amazon Redshift.
D. Use Amazon EMR instead of AWS Glue to group the raw input files. Process the files in Amazon EMR. Load the files into the Amazon Redshift tables.
Comments (3)
- Selected answer: B
Option B: Use the AWS Glue dynamic frame file-grouping option to ingest the raw input files. Process the files. Load the files into the Amazon Redshift tables.
👍 1 · matt200 · 2024/08/14

- Selected answer: B
Answer is B
👍 1 · aragon_saa · 2024/08/14

- Selected answer: B
The key requirement is to reduce processing time for millions of small JSON files stored in Amazon S3. The solution needs to address the inefficiencies caused by the large number of small files while leveraging the existing AWS Glue and Amazon Redshift setup.
👍 1 · minhhnh · 2025/01/04
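The file-grouping behavior the commenters describe can be sketched conceptually. In a real Glue job, grouping is enabled through `connection_options` on the dynamic frame reader (`groupFiles`, `groupSize`), shown in the comments below; since that requires the Glue runtime, the runnable part is a hypothetical pure-Python helper illustrating the batching idea: packing many small files into groups near a target size so each read task handles one group instead of one tiny file.

```python
# Conceptual illustration of AWS Glue's file-grouping option.
# In an actual Glue job, grouping is enabled via connection options, e.g.:
#   glueContext.create_dynamic_frame.from_options(
#       connection_type="s3",
#       connection_options={
#           "paths": ["s3://bucket/test-results/"],
#           "groupFiles": "inPartition",  # group small files within a partition
#           "groupSize": "134217728",     # target group size in bytes (~128 MB)
#       },
#       format="json",
#   )
# The helper below is a hypothetical sketch of the batching idea only.

def group_files(sizes, target):
    """Greedily pack file sizes (in bytes) into groups of roughly `target` bytes."""
    groups, current, current_size = [], [], 0
    for size in sizes:
        # Start a new group once adding this file would exceed the target.
        if current and current_size + size > target:
            groups.append(current)
            current, current_size = [], 0
        current.append(size)
        current_size += size
    if current:
        groups.append(current)
    return groups

# One million 1 KB files collapse into 8 groups at a 128 MB target,
# so a job schedules ~8 read tasks instead of ~1,000,000.
groups = group_files([1024] * 1_000_000, 128 * 1024 * 1024)
print(len(groups))  # → 8
```

This is why option B reduces processing time without adding a separate pre-processing step: the grouping happens inside the existing Glue read path, unlike option A, which adds a Lambda compaction stage, or option D, which replaces the whole engine.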