Topic 1 Question 52
A financial services company is building a robust serverless data lake on Amazon S3. The data lake must be flexible and meet the following requirements:
✑ Support querying old and new data on Amazon S3 through Amazon Athena and Amazon Redshift Spectrum.
✑ Support event-driven ETL pipelines.
✑ Provide a quick and easy way to understand metadata.
Which approach meets these requirements?
A. Use an AWS Glue crawler to crawl S3 data, an AWS Lambda function to trigger an AWS Glue ETL job, and an AWS Glue Data Catalog to search and discover metadata.
B. Use an AWS Glue crawler to crawl S3 data, an AWS Lambda function to trigger an AWS Batch job, and an external Apache Hive metastore to search and discover metadata.
C. Use an AWS Glue crawler to crawl S3 data, an Amazon CloudWatch alarm to trigger an AWS Batch job, and an AWS Glue Data Catalog to search and discover metadata.
D. Use an AWS Glue crawler to crawl S3 data, an Amazon CloudWatch alarm to trigger an AWS Glue ETL job, and an external Apache Hive metastore to search and discover metadata.
Comments (10)
At first glance both A and B look like answers, but an external Apache Hive metastore might not be a serverless solution.
The AWS Glue Data Catalog is your persistent metadata store. It is a managed service that lets you store, annotate, and share metadata in the AWS Cloud in the same way you would in an Apache Hive metastore. The Data Catalog is a drop-in replacement for the Apache Hive Metastore.
https://docs.aws.amazon.com/zh_tw/glue/latest/dg/components-overview.html
So the answer is A.
👍 42 DonaldCMLIN 2021/09/21: Answer is A. Lambda is the preferred way of implementing an event-driven ETL job with S3: when new data arrives in S3, it notifies Lambda, which can start the ETL job.
👍 17 cybe001 2021/10/04: A is correct. Lambda is event-driven, and Glue is serverless, as opposed to Hive.
👍 4 PRC 2021/10/07
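The event-driven pattern the commenters describe (S3 object created → Lambda notified → Glue ETL job started) can be sketched as a minimal Lambda handler. This is an illustrative sketch, not AWS's reference implementation: the Glue job name `my-etl-job` and the job argument names are hypothetical placeholders you would replace with your own.

```python
def extract_s3_objects(event):
    """Return (bucket, key) pairs from an S3 event notification payload."""
    return [
        (r["s3"]["bucket"]["name"], r["s3"]["object"]["key"])
        for r in event.get("Records", [])
    ]

def lambda_handler(event, context):
    # boto3 is preinstalled in the Lambda Python runtime; imported here so
    # the parsing helper above stays testable without AWS credentials.
    import boto3

    glue = boto3.client("glue")
    run_ids = []
    for bucket, key in extract_s3_objects(event):
        # "my-etl-job" is a hypothetical Glue job name; the new object's
        # location is passed through as job arguments.
        resp = glue.start_job_run(
            JobName="my-etl-job",
            Arguments={"--source_bucket": bucket, "--source_key": key},
        )
        run_ids.append(resp["JobRunId"])
    return {"started": run_ids}
```

To wire this up, you would configure an S3 event notification (e.g. on `s3:ObjectCreated:*`) that invokes the Lambda function, and grant its execution role `glue:StartJobRun`. The crawler keeps the Data Catalog tables current so Athena and Redshift Spectrum can query both old and new data.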