Topic 1 Question 26
A company is planning to use a provisioned Amazon EMR cluster that runs Apache Spark jobs to perform big data analysis. The company requires high reliability. A big data team must follow best practices for running cost-optimized and long-running workloads on Amazon EMR. The team must find a solution that will maintain the company's current level of performance. Which combination of resources will meet these requirements MOST cost-effectively? (Choose two.)
A. Use Hadoop Distributed File System (HDFS) as a persistent data store.
B. Use Amazon S3 as a persistent data store.
C. Use x86-based instances for core nodes and task nodes.
D. Use Graviton instances for core nodes and task nodes.
E. Use Spot Instances for all primary nodes.
Comments (6)
Selected Answer: BD
HDFS is not recommended for persistent storage because once a cluster is terminated, all HDFS data is lost. Also, long-running workloads can fill the disk space quickly. Thus, S3 is the best option since it's highly available, durable, and scalable.
AWS Graviton-based instances cost up to 20% less than comparable x86-based Amazon EC2 instances: https://aws.amazon.com/ec2/graviton/
👍 7
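To make the B and D combination concrete, here is a minimal boto3 sketch of how such a long-running cluster could be provisioned with Graviton (arm64) instances and S3 used for persistent output and logs. The cluster name, region, bucket names, instance counts, and IAM role names are illustrative assumptions, not values from the question.

```python
# Minimal sketch: EMR cluster on Graviton instances with S3 as the durable store.
# All names, sizes, and roles below are hypothetical placeholders.
import boto3

emr = boto3.client("emr", region_name="us-east-1")

response = emr.run_job_flow(
    Name="spark-analytics-cluster",
    ReleaseLabel="emr-6.15.0",            # EMR 6.x releases include a Graviton-optimized Spark runtime
    Applications=[{"Name": "Spark"}],
    LogUri="s3://example-emr-logs/",      # logs persisted to S3, not to cluster-local HDFS
    Instances={
        "InstanceGroups": [
            {   # primary (master) node kept On-Demand for reliability, never Spot
                "Name": "Primary",
                "InstanceRole": "MASTER",
                "Market": "ON_DEMAND",
                "InstanceType": "m6g.xlarge",   # Graviton2
                "InstanceCount": 1,
            },
            {   # core nodes also on Graviton; they hold HDFS/shuffle data, so On-Demand
                "Name": "Core",
                "InstanceRole": "CORE",
                "Market": "ON_DEMAND",
                "InstanceType": "m6g.xlarge",
                "InstanceCount": 2,
            },
        ],
        "KeepJobFlowAliveWhenNoSteps": True,   # long-running cluster
        "TerminationProtected": False,
    },
    JobFlowRole="EMR_EC2_DefaultRole",
    ServiceRole="EMR_DefaultRole",
)
print(response["JobFlowId"])
```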
[Removed] 2024/07/20
Selected Answer: BD
B and D.
👍 3
GiorgioGss 2024/09/10
Selected Answer: BD
S3, no question. Graviton => Cost-effectiveness: Graviton instances are ARM-based instances designed for cloud workloads. They offer significant cost savings compared to x86-based instances while delivering comparable or better performance for many Apache Spark workloads. Performance: Graviton instances are optimized for Spark workloads and can deliver the same level of performance as x86-based instances in many cases. Additionally, EMR offers performance-optimized versions of Spark built for Graviton instances.
👍 3
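The point about keeping persistent data on S3 rather than HDFS can be illustrated with a short PySpark job of the kind that would run as a step on the cluster above. The bucket paths and column names (event_date, event_type) are illustrative assumptions.

```python
# Minimal sketch of a Spark job on EMR that reads from and writes to S3 (EMRFS),
# so results survive cluster termination or resizing. Paths/columns are hypothetical.
from pyspark.sql import SparkSession

spark = SparkSession.builder.appName("s3-persistent-store-example").getOrCreate()

# Read source data directly from S3; on EMR the s3:// scheme is handled by EMRFS.
events = spark.read.parquet("s3://example-data-lake/raw/events/")

# Example aggregation representative of a big data analysis workload.
daily_counts = events.groupBy("event_date", "event_type").count()

# Persist results back to S3 instead of cluster-local HDFS.
daily_counts.write.mode("overwrite").parquet("s3://example-data-lake/curated/daily_counts/")

spark.stop()
```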
pypelyncar 2024/12/08