Topic 1 Question 281
A company is collecting a large amount of data from a fleet of IoT devices. Data is stored as Optimized Row Columnar (ORC) files in the Hadoop Distributed File System (HDFS) on a persistent Amazon EMR cluster. The company's data analytics team queries the data by using SQL in Apache Presto deployed on the same EMR cluster. Queries scan large amounts of data, always run for less than 15 minutes, and run only between 5 PM and 10 PM.
The company is concerned about the high cost associated with the current solution. A solutions architect must propose the most cost-effective solution that will allow SQL data queries.
Which solution will meet these requirements?
A. Store data in Amazon S3. Use Amazon Redshift Spectrum to query data.
B. Store data in Amazon S3. Use the AWS Glue Data Catalog and Amazon Athena to query data.
C. Store data in EMR File System (EMRFS). Use Presto in Amazon EMR to query data.
D. Store data in Amazon Redshift. Use Amazon Redshift to query data.
Comments (12)
- Selected answer: B
Storing the data in Amazon S3 is a cost-effective solution compared to running a persistent EMR cluster with HDFS. The AWS Glue Data Catalog provides a centralized metadata repository for organizing and cataloging data in S3. Amazon Athena is a serverless query service that allows you to run SQL queries directly against data in S3 without the need for a dedicated cluster or infrastructure. By using Amazon Athena, you only pay for the queries you run, which aligns with the requirement of cost-effectiveness.
👍 3 Alabi 2023/06/26
- Selected answer: B
Classic serverless S3 data lake: Glue for ETL, Athena for queries.
👍 3 SkyZeroZx 2023/07/02
- Selected answer: B
It's B.
👍 2 NikkyDicky 2023/07/07
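The cost argument in the first comment (pay only for the queries you run, versus paying for a cluster that stays up all day) can be made concrete with a rough back-of-the-envelope sketch. Everything numeric here is an assumption for illustration: the per-TB price, the cluster's hourly rate, the scan size, and the query count are placeholders, not figures from the question, and the function names are hypothetical. S3 storage costs are ignored on both sides.

```python
# Illustrative cost comparison. All prices and volumes are assumed
# placeholders for the sake of the arithmetic, not quoted AWS rates.


def athena_monthly_cost(tb_scanned_per_query, queries_per_day, price_per_tb, days=30):
    """Pay-per-query model: cost scales with data scanned, not with uptime."""
    return tb_scanned_per_query * queries_per_day * days * price_per_tb


def persistent_cluster_monthly_cost(hourly_rate, days=30):
    """Always-on cluster: billed 24 hours a day, even though the
    queries in the question run only between 5 PM and 10 PM."""
    return hourly_rate * 24 * days


if __name__ == "__main__":
    # Assumed inputs: 0.5 TB scanned per query, 20 queries per day,
    # 5.0 per TB scanned, and a cluster costing 10.0 per hour in total.
    athena = athena_monthly_cost(0.5, 20, price_per_tb=5.0)
    cluster = persistent_cluster_monthly_cost(hourly_rate=10.0)
    print(f"Athena (assumed inputs): {athena:,.2f} per month")
    print(f"Persistent cluster (assumed inputs): {cluster:,.2f} per month")
```

With different inputs the gap changes, and very heavy scan volumes could narrow it, so the sketch only shows the shape of the trade-off: the cluster bill is driven by uptime, while the serverless bill is driven by bytes scanned.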