Topic 1 Question 156

Professional Data Engineer

Topic 1 Question 156
You are planning to migrate your current on-premises Apache Hadoop deployment to the cloud. You need to ensure that the deployment is as fault-tolerant and cost-effective as possible for long-running batch jobs. You want to use a managed service. What should you do?
- Deploy a Dataproc cluster. Use a standard persistent disk and 50% preemptible workers. Store data in Cloud Storage, and change references in scripts from hdfs:// to gs://
- Deploy a Dataproc cluster. Use an SSD persistent disk and 50% preemptible workers. Store data in Cloud Storage, and change references in scripts from hdfs:// to gs://
- Install Hadoop and Spark on a 10-node Compute Engine instance group with standard instances. Install the Cloud Storage connector, and store the data in Cloud Storage. Change references in scripts from hdfs:// to gs://
- Install Hadoop and Spark on a 10-node Compute Engine instance group with preemptible instances. Store data in HDFS. Change references in scripts from hdfs:// to gs://
ユーザの投票
コメント(13)
- Correct: A
  
  Ask for cost effective so persistent disk are HDD which are cheaper in comparison to SSD.
  
  👍 29
  [Removed]2020/03/25
- Confused between A and B. For r/w intensive jobs need to use SSDs. But questions doesnt state anything about the nature of the jobs. So better to start with a default option. Choose A
  
  👍 14
  [Removed]2020/03/21
- Answer is A…Cloud Dataproc for Managed Cloud native application and HDD for cost-effective solution.
  
  👍 7
  Rajuuu2020/07/06
シャッフルモード

ユーザの投票

コメント(13)