Topic 1 Question 278

Professional Machine Learning Engineer

You need to train an XGBoost model on a small dataset. Your training code requires custom dependencies. You want to minimize the startup time of your training job. How should you set up your Vertex AI custom training job?
- A. Store the data in a Cloud Storage bucket, and create a custom container with your training application. In your training application, read the data from Cloud Storage and train the model.
- B. Use the XGBoost prebuilt custom container. Create a Python source distribution that includes the data and installs the dependencies at runtime. In your training application, load the data into a pandas DataFrame and train the model.
- C. Create a custom container that includes the data. In your training application, load the data into a pandas DataFrame and train the model.
- D. Store the data in a Cloud Storage bucket, and use the XGBoost prebuilt custom container to run your training application. Create a Python source distribution that installs the dependencies at runtime. In your training application, read the data from Cloud Storage and train the model.
User votes
Comments (9)
- Selected answer: A
  My Answer: A
  
  Focusing on "training code requires custom dependencies" and "minimize the startup time of your training job", the best choice is A, because using a custom container and reading the data from GCS is the faster way.
  
  👍 5
  guilhermebutzke 2024/02/19
- Selected answer: A
  Given the focus on minimizing startup time, and based on the information about XGBoost prebuilt container dependencies available here: https://cloud.google.com/vertex-ai/docs/training/pre-built-containers#xgboost
  
  A: Separating the data from the custom container is the best approach for minimizing startup time, especially for small datasets. Keeping the data in Cloud Storage keeps the container image lean, leading to faster download and startup compared to bundling the data within the container.
  
  B: The prebuilt container could include unnecessary components, potentially increasing the image size and slowing startup.
  
  👍 4
  omermahgoub 2024/04/13
- Selected answer: B
  B
  
  The XGBoost prebuilt container already includes the XGBoost library and all of its dependencies. Including the data in the Python source distribution avoids the overhead of reading it from Cloud Storage a second time. Loading the data into a pandas DataFrame is convenient in Python, since pandas is built for data analysis and manipulation.
  
  👍 3
  Yan_X 2024/03/08
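
The setup argued for in the comments favoring option A (a lean custom container with dependencies baked in, and the data left in Cloud Storage) can be sketched roughly as follows. This is a minimal illustration, not a definitive implementation: the image URI, bucket, file name, machine type, and the `--train-data` flag are all hypothetical placeholders. It relies on the fact that Vertex AI custom training exposes Cloud Storage buckets under `/gcs/` through Cloud Storage FUSE, and it builds the worker pool spec in the shape the CustomJob API expects.

```python
"""Sketch of option A: lean custom container, data read from Cloud Storage."""

# Hypothetical placeholders: replace with your own image and data location.
IMAGE_URI = "us-docker.pkg.dev/my-project/my-repo/xgb-trainer:latest"
DATA_URI = "gs://my-bucket/data/train.csv"


def gcs_uri_to_fuse_path(uri: str) -> str:
    """Map gs://bucket/object to the /gcs/bucket/object path that Vertex AI
    custom training exposes through Cloud Storage FUSE."""
    prefix = "gs://"
    if not uri.startswith(prefix):
        raise ValueError(f"not a Cloud Storage URI: {uri!r}")
    return "/gcs/" + uri[len(prefix):]


def build_worker_pool_specs(image_uri: str, data_uri: str) -> list[dict]:
    """Build a single-replica worker pool spec for a custom container job.

    The image carries the training code and its custom dependencies, so
    nothing is installed at startup; the data stays in Cloud Storage and
    is handed to the training application as a path.
    """
    return [
        {
            # Machine type is an assumption; a small dataset needs little.
            "machine_spec": {"machine_type": "n1-standard-4"},
            "replica_count": 1,
            "container_spec": {
                "image_uri": image_uri,
                # "--train-data" is a hypothetical flag of the training app.
                "args": ["--train-data", gcs_uri_to_fuse_path(data_uri)],
            },
        }
    ]


if __name__ == "__main__":
    specs = build_worker_pool_specs(IMAGE_URI, DATA_URI)
    print(specs[0]["container_spec"]["args"])
```

With the `google-cloud-aiplatform` SDK installed, a list like this could be passed as `worker_pool_specs` to `aiplatform.CustomJob`. Inside the container, the training application would read the CSV at the given path (for example with pandas) and fit the XGBoost model.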