Topic 1 Question 313
You want to migrate an Apache Spark 3 batch job from on-premises to Google Cloud. You need to minimally change the job so that the job reads from Cloud Storage and writes the result to BigQuery. Your job is optimized for Spark, where each executor has 8 vCPU and 16 GB memory, and you want to be able to choose similar settings. You want to minimize installation and management effort to run your job. What should you do?
A. Execute the job as part of a deployment in a new Google Kubernetes Engine cluster.
B. Execute the job from a new Compute Engine VM.
C. Execute the job in a new Dataproc cluster.
D. Execute the job as a Dataproc Serverless job.
Comments (7)
- Selected Answer: D
The priority is to "minimize installation and management effort," which Dataproc Serverless achieves. Furthermore, with Dataproc Serverless you can still specify resource settings for your job, such as the number of vCPUs and memory per executor (https://cloud.google.com/dataproc-serverless/docs/concepts/properties). A sketch of such a submission follows below.
👍 8
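For what it's worth, here is a rough sketch of such a submission using the google-cloud-dataproc Python client. Treat it as illustrative only: the project, region, bucket, and batch names are made up, and the exact property constraints should be checked against the page linked above.

    # Sketch: submit a Dataproc Serverless batch whose executors mirror
    # the on-prem shape (8 vCPU / 16 GB per executor).
    # Requires: pip install google-cloud-dataproc
    from google.cloud import dataproc_v1

    project = "my-project"   # assumption
    region = "us-central1"   # assumption

    client = dataproc_v1.BatchControllerClient(
        client_options={"api_endpoint": f"{region}-dataproc.googleapis.com:443"}
    )

    batch = dataproc_v1.Batch(
        pyspark_batch=dataproc_v1.PySparkBatch(
            main_python_file_uri="gs://my-bucket/jobs/etl.py"  # assumption
        ),
        runtime_config=dataproc_v1.RuntimeConfig(
            properties={
                # Serverless restricts which executor core counts are
                # allowed; 8 cores / 16g matches the on-prem shape.
                "spark.executor.cores": "8",
                "spark.executor.memory": "16g",
            }
        ),
    )

    operation = client.create_batch(
        parent=f"projects/{project}/locations/{region}",
        batch=batch,
        batch_id="spark-migration-batch",  # assumption
    )
    print(operation.result().state)  # block until the batch finishes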
chicity_de (2024/12/10) - Selected Answer: C
Dataproc supports Spark 3, ensuring compatibility with your existing job. It also allows you to customize the cluster configuration, including the number of executors, vCPUs, and memory per executor, to match your on-premises setup (8 vCPU and 16 GB memory); a sketch follows below.
👍 1
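If you did go with C, matching the executor shape comes down to choosing worker machine types. A rough sketch with the same Python client; the custom machine type n2-custom-8-16384 (8 vCPU, 16 GB) and all names are assumptions to be verified:

    # Sketch: create a Dataproc cluster whose workers match the
    # on-prem executor shape via a custom machine type.
    from google.cloud import dataproc_v1

    project = "my-project"   # assumption
    region = "us-central1"   # assumption

    client = dataproc_v1.ClusterControllerClient(
        client_options={"api_endpoint": f"{region}-dataproc.googleapis.com:443"}
    )

    cluster = {
        "project_id": project,
        "cluster_name": "spark-migration",  # assumption
        "config": {
            "master_config": {
                "num_instances": 1,
                "machine_type_uri": "n2-custom-8-16384",
            },
            "worker_config": {
                "num_instances": 2,  # assumption
                "machine_type_uri": "n2-custom-8-16384",
            },
        },
    }

    op = client.create_cluster(project_id=project, region=region, cluster=cluster)
    op.result()  # block until the cluster is running

The trade-off versus D is that you now own the cluster lifecycle (creation, patching, deletion), which is exactly the management effort the question asks to minimize.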
mcdaley (2024/12/07) - Selected Answer: D
Dataproc Serverless allows you to run Spark jobs without needing to manage the underlying infrastructure. It automatically handles resource provisioning and scaling, which simplifies the process and reduces management overhead; a sketch of the minimally changed job follows below.
👍 1
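As for the "minimally change the job" part, the change is usually just pointing reads at gs:// URIs and writing through the spark-bigquery connector (bundled with Dataproc Serverless runtimes). A minimal sketch; the bucket, table, and column names are invented:

    # Sketch of the migrated job: read from Cloud Storage, write the
    # result to BigQuery via the spark-bigquery connector.
    from pyspark.sql import SparkSession

    spark = SparkSession.builder.appName("gcs-to-bq").getOrCreate()

    # Input now comes from a gs:// URI instead of an on-prem path.
    df = spark.read.option("header", "true").csv("gs://my-bucket/input/")  # assumption

    result = df.groupBy("country").count()  # placeholder transformation

    # The direct write method avoids staging via a temporary GCS bucket.
    (result.write.format("bigquery")
        .option("writeMethod", "direct")
        .mode("overwrite")
        .save("my-project.my_dataset.my_table"))  # assumption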
Pime13 (2025/01/06)
Shuffle mode