Topic 1 Question 280
You are using Kubeflow Pipelines to develop an end-to-end PyTorch-based MLOps pipeline. The pipeline reads data from BigQuery, processes the data, performs feature engineering, model training, and model evaluation, and deploys the model as a binary file to Cloud Storage. You are writing code for several different versions of the feature engineering and model training steps, and running each new version in Vertex AI Pipelines. Each pipeline run is taking over an hour to complete. You want to speed up the pipeline execution to reduce your development time, and you want to avoid additional costs. What should you do?
A. Comment out the part of the pipeline that you are not currently updating.
B. Enable caching in all the steps of the Kubeflow pipeline.
C. Delegate feature engineering to BigQuery and remove it from the pipeline.
D. Add a GPU to the model training step.
User votes
Comments (3)
- Suggested answer: B
Yan_X (2024/09/08), 👍 7: B. The question says you are running "different versions of feature engineering and model training," so enabling caching lets unchanged steps reuse the results of previous runs. Probably not C: the question asks for an "end-to-end" MLOps pipeline, and delegating feature engineering to BigQuery would mean it is no longer end-to-end.
JG123 (2024/09/05), 👍 1: Answer is C.
omermahgoub (2024/10/13), 👍 1: B, and here's why:
- Caching directly addresses the redundant computation, especially for the feature engineering and model training steps you re-run frequently.
- With "end-to-end" MLOps, Kubeflow Pipelines handles all stages, including feature engineering, preserving the desired end-to-end workflow.
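For reference, here is a minimal sketch of how caching can be enabled, assuming the Kubeflow Pipelines SDK (kfp v2) and the google-cloud-aiplatform client; the component and pipeline names (`feature_engineering`, `pytorch-mlops-pipeline`) are hypothetical placeholders, not from the question:

```python
# Minimal sketch, assuming kfp v2 and google-cloud-aiplatform;
# component and pipeline names are hypothetical placeholders.
from kfp import dsl, compiler

@dsl.component
def feature_engineering(version: str) -> str:
    # Placeholder for the real feature engineering logic.
    return f"features-{version}"

@dsl.pipeline(name="pytorch-mlops-pipeline")
def pipeline(version: str = "v1"):
    fe_task = feature_engineering(version=version)
    # Reuse the cached output when the component spec and inputs are
    # unchanged, so unmodified steps are skipped on subsequent runs.
    fe_task.set_caching_options(enable_caching=True)

compiler.Compiler().compile(pipeline_func=pipeline, package_path="pipeline.json")

# Caching can also be toggled for the whole run at submission time:
from google.cloud import aiplatform

job = aiplatform.PipelineJob(
    display_name="pytorch-mlops-pipeline",
    template_path="pipeline.json",
    enable_caching=True,  # cached steps are reused; changed steps re-execute
)
# job.run()  # assumes a configured GCP project and region
```

With caching on, only the steps whose code or inputs changed (here, the feature engineering and model training versions under development) actually re-execute, which shortens iteration time without adding cost.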