Topic 1 Question 287
You are tasked with building an MLOps pipeline to retrain tree-based models in production. The pipeline will include components related to data ingestion, data processing, model training, model evaluation, and model deployment. Your organization primarily uses PySpark-based workloads for data preprocessing. You want to minimize infrastructure management effort. How should you set up the pipeline?
A. Set up a TensorFlow Extended (TFX) pipeline on Vertex AI Pipelines to orchestrate the MLOps pipeline. Write a custom component for the PySpark-based workloads on Dataproc.
B. Set up Vertex AI Pipelines to orchestrate the MLOps pipeline. Use the predefined Dataproc component for the PySpark-based workloads.
C. Set up Kubeflow Pipelines on Google Kubernetes Engine to orchestrate the MLOps pipeline. Write a custom component for the PySpark-based workloads on Dataproc.
D. Set up Cloud Composer to orchestrate the MLOps pipeline. Use Dataproc workflow templates for the PySpark-based workloads in Cloud Composer.
Comments (4)
- Selected Answer: B
B) Best option: it offers the highest ease of use, integrates with the existing PySpark workloads (via Dataproc), and requires minimal infrastructure management, because Vertex AI Pipelines is fully managed, minimizing infrastructure effort, and is natively integrated with Dataproc for PySpark (while Composer is not); the predefined Dataproc component for PySpark workloads reduces effort and the chance of errors; and it is suitable for tree-based models (the other options are too, but with more effort). See the sketch after the comments.
👍 2
carolctech, 2024/10/26 - Selected Answer: B
This is the most suitable approach.
👍 2
AB_C, 2024/11/27 - Selected Answer: B
A - Rejected because it requires writing a custom component for the PySpark-based workloads. C - Kubeflow Pipelines on GKE is not a managed service, and the question asks to minimize infrastructure management effort. D - Cloud Composer requires managing its own environment, which also conflicts with minimizing infrastructure management effort.
👍 1
Omi_04040, 2024/12/09
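For reference, a minimal sketch of what option B could look like, using the Kubeflow Pipelines SDK and the predefined Dataproc component (DataprocPySparkBatchOp) from the google-cloud-pipeline-components package. The project, region, bucket, file URIs, and the training step are placeholders, and parameter names should be verified against the current version of the library.

# Sketch of option B: Vertex AI Pipelines orchestrating a PySpark
# preprocessing step via the predefined Dataproc component.
# All resource names and URIs below are placeholders.
from kfp import compiler, dsl
from google_cloud_pipeline_components.v1.dataproc import DataprocPySparkBatchOp

PROJECT = "my-project"                           # placeholder
REGION = "us-central1"                           # placeholder
PIPELINE_ROOT = "gs://my-bucket/pipeline-root"   # placeholder


@dsl.component(base_image="python:3.10")
def train_tree_model(processed_data_uri: str) -> str:
    """Placeholder training step for a tree-based model (e.g. XGBoost)."""
    # Real training and evaluation logic would go here.
    return f"trained on {processed_data_uri}"


@dsl.pipeline(name="tree-model-retraining", pipeline_root=PIPELINE_ROOT)
def retraining_pipeline(raw_data_uri: str):
    # Predefined component: submits the PySpark job as a Dataproc batch,
    # so no cluster needs to be created or managed by the team.
    preprocess = DataprocPySparkBatchOp(
        project=PROJECT,
        location=REGION,
        main_python_file_uri="gs://my-bucket/jobs/preprocess.py",  # placeholder
        args=[raw_data_uri],
    )
    # Downstream training step runs after preprocessing completes.
    train_tree_model(processed_data_uri="gs://my-bucket/processed/").after(preprocess)


if __name__ == "__main__":
    compiler.Compiler().compile(retraining_pipeline, "retraining_pipeline.json")

Because the pipeline itself runs on Vertex AI Pipelines (fully managed) and the PySpark step runs as a Dataproc batch job, neither the orchestrator nor a Spark cluster has to be provisioned or maintained, which is the "minimize infrastructure management effort" requirement in the question.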