Topic 1 Question 56

AWS Certified Data Engineer - Associate

A security company stores IoT data that is in JSON format in an Amazon S3 bucket. The data structure can change when the company upgrades the IoT devices. The company wants to create a data catalog that includes the IoT data. The company's analytics department will use the data catalog to index the data. Which solution will meet these requirements MOST cost-effectively?
- A. Create an AWS Glue Data Catalog. Configure an AWS Glue Schema Registry. Create a new AWS Glue workload to orchestrate the ingestion of the data that the analytics department will use into Amazon Redshift Serverless.
- B. Create an Amazon Redshift provisioned cluster. Create an Amazon Redshift Spectrum database for the analytics department to explore the data that is in Amazon S3. Create Redshift stored procedures to load the data into Amazon Redshift.
- C. Create an Amazon Athena workgroup. Explore the data that is in Amazon S3 by using Apache Spark through Athena. Provide the Athena workgroup schema and tables to the analytics department.
- D. Create an AWS Glue Data Catalog. Configure an AWS Glue Schema Registry. Create AWS Lambda user defined functions (UDFs) by using the Amazon Redshift Data API. Create an AWS Step Functions job to orchestrate the ingestion of the data that the analytics department will use into Amazon Redshift Serverless.
User votes
Comments (7)
- Selected answer: A
  Option A, creating an AWS Glue Data Catalog with Glue Schema Registry and orchestrating data ingestion into Amazon Redshift Serverless using AWS Glue, appears to be the most cost-effective and suitable solution. It offers a serverless approach to manage the evolving data schema of the IoT data and efficiently supports data analytics needs without the overhead of managing a provisioned database cluster or complex orchestration setups.
  
  👍 9
  rralucard_ 2024/02/02
- Selected answer: A
  The objective is to create a data catalog that includes the IoT data, and AWS Glue Data Catalog is the best option for this requirement. https://docs.aws.amazon.com/glue/latest/dg/catalog-and-crawler.html
  
  C is incorrect. While Athena makes it easy to read from S3 using SQL, it does not crawl the data source or create a data catalog.
  
  👍 4
  chris_spencer 2024/04/17
- Selected answer: C
  Options A, B, and D involve setting up additional infrastructure (e.g., AWS Glue, Redshift clusters, Lambda functions) which may incur unnecessary costs and complexity for the given requirements. Option C, on the other hand, utilizes a serverless and scalable solution directly querying data in S3, making it the most cost-effective choice.
  
  👍 2
  lucas_rfsb 2024/04/01
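
Several comments above rely on the AWS Glue Data Catalog and its crawlers to keep up with a changing JSON structure. As a rough illustration only, the sketch below shows how a crawler over the S3 data might be configured with boto3 so that schema changes detected on later runs update the catalog tables. The crawler name, IAM role ARN, database name, and S3 path are all hypothetical placeholders, not values from the question, and the sketch does not settle which option is correct.

```python
def build_crawler_request(name, role_arn, database, s3_path):
    """Build keyword arguments for the Glue CreateCrawler API call.

    All argument values are supplied by the caller; nothing here is
    taken from the exam question itself.
    """
    return {
        "Name": name,
        "Role": role_arn,
        "DatabaseName": database,
        # Point the crawler at the S3 prefix that holds the JSON files.
        "Targets": {"S3Targets": [{"Path": s3_path}]},
        # When a later crawl detects a changed schema, update the table
        # definition in the Data Catalog; only log objects that disappear.
        "SchemaChangePolicy": {
            "UpdateBehavior": "UPDATE_IN_DATABASE",
            "DeleteBehavior": "LOG",
        },
    }


def create_and_start_crawler(request):
    """Create the crawler and start one run (requires AWS credentials)."""
    import boto3  # imported lazily so the builder works without boto3

    glue = boto3.client("glue")
    glue.create_crawler(**request)
    glue.start_crawler(Name=request["Name"])


if __name__ == "__main__":
    # Hypothetical placeholder values for illustration only.
    request = build_crawler_request(
        name="iot-json-crawler",
        role_arn="arn:aws:iam::123456789012:role/ExampleGlueCrawlerRole",
        database="iot_catalog",
        s3_path="s3://example-iot-bucket/devices/",
    )
    print(request)
    # create_and_start_crawler(request)  # uncomment to call AWS
```

Once a crawler like this has populated the catalog, the resulting tables can be queried from services that read the Glue Data Catalog, which is the cost trade-off the commenters are debating.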