Topic 1 Question 184
You work for a retail company. You have a managed tabular dataset in Vertex AI that contains sales data from three different stores. The dataset includes several features, such as store name and sale timestamp. You want to use the data to train a model that makes sales predictions for a new store that will open soon. You need to split the data between the training, validation, and test sets. What approach should you use to split the data?
A. Use Vertex AI manual split, using the store name feature to assign one store for each set
B. Use Vertex AI default data split
C. Use Vertex AI chronological split, and specify the sales timestamp feature as the time variable
D. Use Vertex AI random split, assigning 70% of the rows to the training set, 10% to the validation set, and 20% to the test set
Comments (2)
- Selected answer: A
By using a manual split based on store names, you can train a model that is more sensitive to the unique characteristics of each store, ultimately leading to better predictions for the new store.
👍 1 | pikachu007 | 2024/01/11
- Selected answer: C
Anything different than option C could potentially lead to data leakage imo.
👍 1 | b1a8fae | 2024/01/12
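
To make the two approaches argued for in the comments concrete, here is a minimal plain-Python sketch of what each split does to the rows. It is not the Vertex AI API: the toy rows, the store names, the helper names `split_by_store` and `split_chronologically`, and the 80/10/10 fractions are all assumptions made up for illustration.

```python
from datetime import datetime

# Toy rows standing in for the sales table (all values are invented).
rows = [
    {"store": "store_a", "ts": datetime(2023, 1, 5), "sales": 120.0},
    {"store": "store_b", "ts": datetime(2023, 1, 9), "sales": 95.0},
    {"store": "store_c", "ts": datetime(2023, 2, 2), "sales": 130.0},
    {"store": "store_a", "ts": datetime(2023, 2, 20), "sales": 110.0},
    {"store": "store_b", "ts": datetime(2023, 3, 3), "sales": 101.0},
    {"store": "store_c", "ts": datetime(2023, 3, 15), "sales": 140.0},
    {"store": "store_a", "ts": datetime(2023, 4, 1), "sales": 125.0},
    {"store": "store_b", "ts": datetime(2023, 4, 18), "sales": 99.0},
    {"store": "store_c", "ts": datetime(2023, 5, 7), "sales": 150.0},
    {"store": "store_a", "ts": datetime(2023, 5, 30), "sales": 118.0},
]


def split_by_store(rows, assignment):
    """Manual split (option A): each row goes to the set assigned to its store."""
    sets = {"train": [], "validation": [], "test": []}
    for row in rows:
        sets[assignment[row["store"]]].append(row)
    return sets


def split_chronologically(rows, train_frac=0.8, val_frac=0.1):
    """Chronological split (option C): oldest rows train, newest rows test."""
    ordered = sorted(rows, key=lambda r: r["ts"])
    n_train = int(len(ordered) * train_frac)
    n_val = int(len(ordered) * val_frac)
    return {
        "train": ordered[:n_train],
        "validation": ordered[n_train:n_train + n_val],
        "test": ordered[n_train + n_val:],
    }


# Hypothetical store-to-set assignment for the manual split.
by_store = split_by_store(
    rows, {"store_a": "train", "store_b": "validation", "store_c": "test"}
)
by_time = split_chronologically(rows)

for name, split in (("by store", by_store), ("by time", by_time)):
    sizes = {k: len(v) for k, v in split.items()}
    test_stores = sorted({r["store"] for r in split["test"]})
    print(name, sizes, "test stores:", test_stores)
```

In the store-based split the test set contains only a store the model never saw during training, which mirrors predicting for a new store. In the time-based split every test row is later than every training row, which is the leakage concern raised in the second comment, but the stores in the test set may also appear in training.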