Topic 1 Question 78
You are creating a deep neural network classification model using a dataset with categorical input values. Certain columns have a cardinality greater than 10,000 unique values. How should you encode these categorical values as input into the model?
Convert each categorical value into an integer value.
Convert the categorical string data to one-hot hash buckets.
Map the categorical variables into a vector of boolean values.
Convert each categorical value into a run-length encoded string.
ユーザの投票
コメント(12)
- 正解だと思う選択肢: B👍 4hiromi2022/12/18
- 正解だと思う選択肢: A
Answer A since with 10.000 unique values one-hot shouldn't be a good solution https://machinelearningmastery.com/how-to-prepare-categorical-data-for-deep-learning-in-python/
👍 2LearnSodas2022/12/11 - 正解だと思う選択肢: B
B unconditoinally https://cloud.google.com/ai-platform/training/docs/algorithms/xgboost#analysis
If the column is categorical with high cardinality, then the column is treated with hashing, where the number of hash buckets equals to the square root of the number of unique values in the column. A categorical column is considered to have high cardinality if the number of unique values is greater than the square root of the number of rows in the dataset.
👍 2John_Pongthorn2023/01/26
シャッフルモード