Topic 1 Question 106
You are building a linear regression model on BigQuery ML to predict a customer’s likelihood of purchasing your company’s products. Your model uses a city name variable as a key predictive component. In order to train and serve the model, your data must be organized in columns. You want to prepare your data using the least amount of coding while maintaining the predictive variables. What should you do?
A. Use TensorFlow to create a categorical variable with a vocabulary list. Create the vocabulary file, and upload it as part of your model to BigQuery ML.
B. Create a new view with BigQuery that does not include a column with city information.
C. Use Cloud Data Fusion to assign each city to a region labeled as 1, 2, 3, 4, or 5, and then use that number to represent the city in the model.
D. Use Dataprep to transform the state column using a one-hot encoding method, and make each city a column with binary values.
Comments (11)
- Selected answer: C
I vote for C
👍 2 hiromi 2022/12/20
- Selected answer: D
One-hot encoding makes sense to me
👍 2 mymy9418 2022/12/21
- Selected answer: D
One-hot encoding is a good way to use categorical variables in regression problems: https://academic.oup.com/rheumatology/article/54/7/1141/1849688 https://cloud.google.com/bigquery-ml/docs/reference/standard-sql/bigqueryml-auto-preprocessing
👍 2 guilhermebutzke 2023/03/14
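For readers unfamiliar with the one-hot encoding mentioned in these comments, here is a minimal plain-Python sketch of the idea: a categorical city column becomes one binary column per distinct city. This is not Dataprep's or BigQuery ML's API; the helper `one_hot_encode`, the `visits` field, and the sample city names are hypothetical and exist only for illustration.

```python
def one_hot_encode(rows, column):
    """Replace a categorical column with one 0/1 column per distinct value."""
    # Collect the distinct category values in a stable (sorted) order.
    categories = sorted({row[column] for row in rows})
    encoded = []
    for row in rows:
        # Copy every field except the categorical one being encoded.
        new_row = {key: value for key, value in row.items() if key != column}
        # Add one binary indicator column per category.
        for category in categories:
            new_row[f"{column}_{category}"] = 1 if row[column] == category else 0
        encoded.append(new_row)
    return encoded


# Hypothetical sample data: city names and a numeric feature.
customers = [
    {"city": "Tokyo", "visits": 3},
    {"city": "Osaka", "visits": 1},
    {"city": "Tokyo", "visits": 5},
]

encoded_customers = one_hot_encode(customers, "city")
for row in encoded_customers:
    print(row)
```

Each output row keeps `visits` and gains the columns `city_Osaka` and `city_Tokyo`, exactly one of which is 1. That is the "each city a column with binary values" layout described in option D.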