Topic 1 Question 13
A company stores daily records of the financial performance of investment portfolios in .csv format in an Amazon S3 bucket. A data engineer uses AWS Glue crawlers to crawl the S3 data. The data engineer must make the S3 data accessible daily in the AWS Glue Data Catalog. Which solution will meet these requirements?
A. Create an IAM role that includes the AmazonS3FullAccess policy. Associate the role with the crawler. Specify the S3 bucket path of the source data as the crawler's data store. Create a daily schedule to run the crawler. Configure the output destination to a new path in the existing S3 bucket.
B. Create an IAM role that includes the AWSGlueServiceRole policy. Associate the role with the crawler. Specify the S3 bucket path of the source data as the crawler's data store. Create a daily schedule to run the crawler. Specify a database name for the output.
C. Create an IAM role that includes the AmazonS3FullAccess policy. Associate the role with the crawler. Specify the S3 bucket path of the source data as the crawler's data store. Allocate data processing units (DPUs) to run the crawler every day. Specify a database name for the output.
D. Create an IAM role that includes the AWSGlueServiceRole policy. Associate the role with the crawler. Specify the S3 bucket path of the source data as the crawler's data store. Allocate data processing units (DPUs) to run the crawler every day. Configure the output destination to a new path in the existing S3 bucket.
B. Create an IAM role that includes the AWSGlueServiceRole policy. Associate the role with the crawler. Specify the S3 bucket path of the source data as the crawler's data store. Create a daily schedule to run the crawler. Specify a database name for the output.
Explanation: Option B correctly sets up the IAM role with the necessary permissions using the AWSGlueServiceRole policy, which is designed for use with AWS Glue. It specifies the S3 bucket path of the source data as the crawler's data store and creates a daily schedule to run the crawler. Additionally, it specifies a database name for the output, ensuring that the crawled data is properly cataloged in the AWS Glue Data Catalog.
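As a concrete illustration, option B maps directly onto the Glue CreateCrawler API. Below is a minimal boto3 sketch; the crawler name, role name, database name, bucket path, and schedule are hypothetical placeholders, not values from the question:

```python
import boto3

glue = boto3.client("glue")

# Hypothetical names and paths; substitute your own. The shape of the call is
# what matters: an S3 target, a Data Catalog database, and a daily schedule.
glue.create_crawler(
    Name="daily-portfolio-crawler",
    Role="GlueCrawlerRole",  # IAM role with the AWSGlueServiceRole policy attached
    DatabaseName="portfolio_db",  # output database in the Glue Data Catalog
    Targets={"S3Targets": [{"Path": "s3://example-portfolio-bucket/daily/"}]},
    Schedule="cron(0 1 * * ? *)",  # run once a day at 01:00 UTC
)
```

Note that create_crawler takes no DPU or capacity parameter; crawler capacity is managed by the service, which is part of why the DPU-allocating options are off.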
Comments (5)

👍 8 TonyStark0122 2024/09/20 - Selected Answer: B
A and C are wrong because you don't need full S3 access. D is wrong because you don't need to provision DPUs, and the destination should be a database, not an S3 bucket. So it's B.
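To make the least-privilege point concrete, here is a hedged sketch of the role setup: the AWS-managed AWSGlueServiceRole policy plus a narrow inline read policy scoped to the source bucket, instead of AmazonS3FullAccess. Role, policy, and bucket names are hypothetical:

```python
import json
import boto3

iam = boto3.client("iam")

# Trust policy letting the Glue service assume the role.
trust_policy = {
    "Version": "2012-10-17",
    "Statement": [{
        "Effect": "Allow",
        "Principal": {"Service": "glue.amazonaws.com"},
        "Action": "sts:AssumeRole",
    }],
}

iam.create_role(
    RoleName="GlueCrawlerRole",  # hypothetical; matches the crawler sketch above
    AssumeRolePolicyDocument=json.dumps(trust_policy),
)

# AWS-managed policy with the Glue service permissions a crawler needs.
iam.attach_role_policy(
    RoleName="GlueCrawlerRole",
    PolicyArn="arn:aws:iam::aws:policy/service-role/AWSGlueServiceRole",
)

# Read-only access scoped to the source bucket, not AmazonS3FullAccess.
s3_read_policy = {
    "Version": "2012-10-17",
    "Statement": [{
        "Effect": "Allow",
        "Action": ["s3:GetObject", "s3:ListBucket"],
        "Resource": [
            "arn:aws:s3:::example-portfolio-bucket",
            "arn:aws:s3:::example-portfolio-bucket/*",
        ],
    }],
}

iam.put_role_policy(
    RoleName="GlueCrawlerRole",
    PolicyName="PortfolioBucketReadOnly",
    PolicyDocument=json.dumps(s3_read_policy),
)
```

The AWSGlueServiceRole managed policy only grants S3 access to aws-glue-* paths, so a scoped inline policy like the one above is what actually grants the crawler read access to the source bucket.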
👍 4 GiorgioGss 2024/03/07 - Selected Answer: B
Glue crawlers are serverless, so the options that mention allocating DPUs are what ruled them out for me. That settled it on option B.
👍 4 k350Secops 2024/05/09