Topic 1 Question 62
While conducting an exploratory analysis of a dataset, you discover that categorical feature A has substantial predictive power, but it is sometimes missing. What should you do?
A. Drop feature A if more than 15% of values are missing. Otherwise, use feature A as-is.
B. Compute the mode of feature A and then use it to replace the missing values in feature A.
C. Replace the missing values with the values of the feature with the highest Pearson correlation with feature A.
D. Add an additional class to categorical feature A for missing values. Create a new binary feature that indicates whether feature A is missing.
Comments (11)
- Selected answer: D
ans: D
A => No, you don't want to drop a feature with high predictive power.
B => I think this could confuse the model. A better solution could be to fill missing values using an algorithm like Expectation Maximization, but using the mode is, I think, a bad idea in this case: if a significant share of the values is missing (for example >10%), it would distort the feature's predictive power. You don't want to lose the predictive power of a feature, just guide the model to learn when to use that feature and when to ignore it.
C => This doesn't make any sense to me. Not sure why I would do that.
D => I think this could be a really good approach, and I'm pretty sure it would work well with a lot of models. The model would learn that when "is_available_feat_A" == True it should use feature A, and whenever the feature is missing it should rely on other features.
👍 9 · wish0035 · 2022/12/15
- Selected answer: B
B "For categorical variables, we can usually replace missing values with mean, median, or most frequent values" Dr. Logan Song - Journey to Become a Google Cloud Machine Learning Engineer - Page 48
👍 4 · hiromi · 2022/12/16
- Selected answer: B
I agree with B
👍 3 · LearnSodas · 2022/12/10
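For anyone comparing B and D in practice, here is a minimal pandas sketch of both. It is only an illustration under assumptions: the column name `feature_a`, the `"MISSING"` label, the helper names, and the toy data are all made up, not part of the question.

```python
import pandas as pd


def encode_missing_categorical(
    df: pd.DataFrame, col: str, missing_label: str = "MISSING"
) -> pd.DataFrame:
    """Option D: add an explicit 'missing' class plus a binary indicator.

    `missing_label` is a hypothetical placeholder; any value that does not
    collide with a real category would do.
    """
    out = df.copy()
    # Binary feature: 1 where the original value was missing, else 0.
    out[f"{col}_is_missing"] = out[col].isna().astype(int)
    # Replace missing values with their own category.
    out[col] = out[col].astype("object").fillna(missing_label)
    return out


def impute_mode(df: pd.DataFrame, col: str) -> pd.DataFrame:
    """Option B: replace missing values with the most frequent category."""
    out = df.copy()
    mode_value = out[col].mode(dropna=True).iloc[0]
    out[col] = out[col].fillna(mode_value)
    return out


# Toy data (made up): two of five values are missing.
toy = pd.DataFrame({"feature_a": ["red", None, "blue", "red", None]})

encoded = encode_missing_categorical(toy, "feature_a")
imputed = impute_mode(toy, "feature_a")

print(encoded)
print(imputed)
```

With mode imputation the two missing rows become indistinguishable from genuine "red" rows, whereas the option D encoding keeps the fact that the value was missing available to the model.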