Topic 1 Question 166
Select two. A data engineer configured an AWS Glue Data Catalog for data that is stored in Amazon S3 buckets. The data engineer needs to configure the Data Catalog to receive incremental updates.
The data engineer sets up event notifications for the S3 bucket and creates an Amazon Simple Queue Service (Amazon SQS) queue to receive the S3 events.
Which combination of steps should the data engineer take to meet these requirements with LEAST operational overhead?
A. Create an S3 event-based AWS Glue crawler to consume events from the SQS queue.
B. Define a time-based schedule to run the AWS Glue crawler, and perform incremental updates to the Data Catalog.
C. Use an AWS Lambda function to directly update the Data Catalog based on S3 events that the SQS queue receives.
D. Manually initiate the AWS Glue crawler to perform updates to the Data Catalog when there is a change in the S3 bucket.
E. Use AWS Step Functions to orchestrate the process of updating the Data Catalog based on S3 events that the SQS queue receives.
Comments (7)
- Selected answer: AB
Based on this article (Option 1 for the architecture) it should be AB:
- Run the crawler on a schedule.
- The crawler polls for object create events in the SQS queue.
  - If there are events, the crawler updates the Data Catalog.
  - If not, the crawler stops.
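The scheduled, event-driven flow above can be sketched with boto3 roughly as follows. This is a minimal sketch, not a definitive setup: the crawler name, role ARN, database name, S3 path, queue ARN, and cron expression are all placeholder assumptions. `CRAWL_EVENT_MODE` asks the crawler to read S3 events from the SQS queue rather than relist the whole bucket, and `Schedule` decides when it polls.

```python
def build_event_crawler_request(name, role_arn, database, s3_path, queue_arn, schedule):
    """Build keyword arguments for glue.create_crawler (all values are placeholders)."""
    return {
        "Name": name,
        "Role": role_arn,
        "DatabaseName": database,
        # The S3 target points at the data and at the SQS queue that receives S3 events.
        "Targets": {"S3Targets": [{"Path": s3_path, "EventQueueArn": queue_arn}]},
        # Event mode: crawl only the objects named in the queued S3 events.
        "RecrawlPolicy": {"RecrawlBehavior": "CRAWL_EVENT_MODE"},
        # Time-based schedule: the crawler checks the queue on each scheduled run.
        "Schedule": schedule,
    }


def create_crawler(request):
    """Send the request to AWS Glue (needs boto3 and valid AWS credentials)."""
    import boto3  # imported here so the builder above works without boto3 installed

    boto3.client("glue").create_crawler(**request)


# Hypothetical values for illustration only.
request = build_event_crawler_request(
    name="example-event-crawler",
    role_arn="arn:aws:iam::123456789012:role/ExampleGlueCrawlerRole",
    database="example_db",
    s3_path="s3://example-bucket/sales/",
    queue_arn="arn:aws:sqs:us-east-1:123456789012:example-s3-events",
    schedule="cron(0 * * * ? *)",  # hourly
)
```

Calling `create_crawler(request)` would register the crawler; nothing else has to be deployed or maintained, which is the low-overhead argument for A plus B.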
👍 3 pikuantne 2024/10/31
- Selected answer: AC
Option A suggests creating an S3 event-based AWS Glue crawler to consume events from the SQS queue. This option is appropriate because it lets the crawler respond to events automatically, which reduces manual intervention and keeps the Data Catalog up to date.
Option C involves using an AWS Lambda function to directly update the Data Catalog based on S3 events received from the SQS queue. This is a strong candidate because it automates the update process without manual scheduling or intervention, which minimizes operational overhead. AWS Glue crawlers can consume events from an SQS queue: https://docs.aws.amazon.com/glue/latest/dg/crawler-s3-event-notifications.html
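To make the Lambda route in option C concrete, here is a rough sketch of only the first step such a function would need: unpacking S3 notifications from the SQS messages it receives. The function name and the sample event are hypothetical. The handler would still have to map each key to a table or partition and call the Glue APIs itself, which is custom code to write and maintain.

```python
import json
from urllib.parse import unquote_plus


def extract_s3_objects(sqs_event):
    """Return (bucket, key) pairs from an SQS-triggered Lambda event carrying S3 notifications."""
    objects = []
    for message in sqs_event.get("Records", []):
        # Each SQS message body is the S3 notification serialized as a JSON string.
        body = json.loads(message["body"])
        # S3 test events have no "Records" key, so default to an empty list.
        for record in body.get("Records", []):
            bucket = record["s3"]["bucket"]["name"]
            # Object keys arrive URL-encoded in S3 notifications.
            key = unquote_plus(record["s3"]["object"]["key"])
            objects.append((bucket, key))
    return objects


# Hypothetical event, trimmed to the fields used above.
sample_event = {
    "Records": [
        {
            "body": json.dumps(
                {
                    "Records": [
                        {
                            "s3": {
                                "bucket": {"name": "example-bucket"},
                                "object": {"key": "sales/year%3D2024/part+0.parquet"},
                            }
                        }
                    ]
                }
            )
        }
    ]
}

print(extract_s3_objects(sample_event))
```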
👍 3 tucobbad 2024/11/06
- Selected answer: AC
B and D are wrong due to "Scheduling" and "Manually", respectively. E is too much for this use case.
👍 3 michele_scar 2024/11/15