Topic 1 Question 108

AWS Certified Data Engineer - Associate

A data engineer needs to debug an AWS Glue job that reads from Amazon S3 and writes to Amazon Redshift. The data engineer enabled the bookmark feature for the AWS Glue job. The data engineer has set the maximum concurrency for the AWS Glue job to 1.

The AWS Glue job is successfully writing the output to Amazon Redshift. However, the Amazon S3 files that were loaded during previous runs of the AWS Glue job are being reprocessed by subsequent runs.

What is the likely reason the AWS Glue job is reprocessing the files?
- A. The AWS Glue job does not have the s3:GetObjectAcl permission that is required for bookmarks to work correctly.
- B. The maximum concurrency for the AWS Glue job is set to 1.
- C. The data engineer incorrectly specified an older version of AWS Glue for the Glue job.
- D. The AWS Glue job does not have a required commit statement.
Comments (13)
- Selected answer: D
  https://docs.aws.amazon.com/glue/latest/dg/glue-troubleshooting-errors.html#error-job-bookmarks-reprocess-data
  
  👍 8
  lool 2024/07/06
- D is good
  
  https://docs.aws.amazon.com/glue/latest/dg/glue-troubleshooting-errors.html#error-job-bookmarks-reprocess-data
  
  👍 4
  Bmaster 2024/06/28
- Selected answer: A
  For AWS Glue bookmarks to function correctly, the job needs the necessary permissions to read and write bookmark data, including the s3:GetObjectAcl permission. If these permissions are not correctly set, the job may not be able to track which files have already been processed, leading to reprocessing of previously processed files.
  
  👍 4
  antun3ra 2024/08/08
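
To make the reasoning behind option D concrete, here is a minimal toy model in plain Python of why a missing commit leads to reprocessing. It is only a sketch under stated assumptions: `ToyBookmark` and `run_job` are hypothetical names, not part of the AWS Glue API, and real bookmarks track more than a set of object keys. In an actual Glue script the corresponding pieces are `job.init(...)` at the start and `job.commit()` at the end, together with a `transformation_ctx` on each source.

```python
from dataclasses import dataclass, field


@dataclass
class ToyBookmark:
    """Hypothetical stand-in for per-job bookmark state (not a Glue class)."""
    processed: set = field(default_factory=set)


def run_job(bookmark, s3_keys, commit):
    """Simulate one job run and return the keys it loaded."""
    # With bookmarks enabled, a run only picks up input the bookmark has not recorded.
    to_load = sorted(k for k in s3_keys if k not in bookmark.processed)
    # ... the write to the target (Amazon Redshift in the question) would happen here ...
    if commit:
        # Stand-in for job.commit(): persist the progress made by this run.
        bookmark.processed.update(to_load)
    return to_load


keys = ["in/a.json", "in/b.json"]

# Without a commit, the bookmark never advances, so the second run repeats the first.
no_commit = ToyBookmark()
first_without = run_job(no_commit, keys, commit=False)
second_without = run_job(no_commit, keys, commit=False)

# With a commit, the second run finds nothing new to load.
with_commit = ToyBookmark()
first_with = run_job(with_commit, keys, commit=True)
second_with = run_job(with_commit, keys, commit=True)
```

In this model the output is written successfully in both cases, which matches the symptom in the question: the load to Redshift works, but earlier files are picked up again on every run.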