Topic 1 Question 62

Professional Machine Learning Engineer

While conducting an exploratory analysis of a dataset, you discover that categorical feature A has substantial predictive power, but it is sometimes missing. What should you do?
- A. Drop feature A if more than 15% of values are missing. Otherwise, use feature A as-is.
- B. Compute the mode of feature A and then use it to replace the missing values in feature A.
- C. Replace the missing values with the values of the feature with the highest Pearson correlation with feature A.
- D. Add an additional class to categorical feature A for missing values. Create a new binary feature that indicates whether feature A is missing.
Comments (11)
- Selected answer: D
  ans: D
  
  A => No, you don't want to drop a feature with high predictive power.
  B => I think this could confuse the model. A better solution could be to fill the missing values using an algorithm like Expectation Maximization. Using the mode is a bad idea in this case, because if a significant share of values is missing (for example, more than 10%), it would distort the feature's predictive power. You don't want to lose that predictive power; you want to guide the model to learn when to use the feature and when to ignore it.
  C => This doesn't make sense to me. I'm not sure why I would do that.
  D => I think this is a really good approach, and I'm pretty sure it would work well with a lot of models. The model would learn to use feature A when "is_available_feat_A" == True, and to rely on other features whenever it is missing.
  
  👍 9
  wish0035 2022/12/15
- Selected answer: B
  B: "For categorical variables, we can usually replace missing values with mean, median, or most frequent values" (Dr. Logan Song, Journey to Become a Google Cloud Machine Learning Engineer, page 48).
  
  👍 4
  hiromi 2022/12/16
- Selected answer: B
  I agree with B.
  
  👍 3
  LearnSodas 2022/12/10
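To make the disagreement between answers B and D concrete, here is a minimal sketch in plain Python that applies both treatments to the same column. The toy values, the `MISSING` label, and the helper names `encode_with_missing_class` and `impute_with_mode` are assumptions made for illustration; none of them come from the question or from a specific library.

```python
from collections import Counter

MISSING = "MISSING"  # hypothetical label for the extra class (option D)


def encode_with_missing_class(values):
    """Option D: add an explicit class for missing values and a 0/1 indicator."""
    filled = [MISSING if v is None else v for v in values]
    indicator = [1 if v is None else 0 for v in values]
    return filled, indicator


def impute_with_mode(values):
    """Option B: replace missing values with the most frequent observed value."""
    observed = [v for v in values if v is not None]
    mode_value = Counter(observed).most_common(1)[0][0]
    return [mode_value if v is None else v for v in values]


# Hypothetical toy column standing in for categorical feature A.
feature_a = ["red", "blue", None, "red", None, "green"]

filled, is_missing = encode_with_missing_class(feature_a)
mode_filled = impute_with_mode(feature_a)

print(filled)       # missing rows keep their own category
print(is_missing)   # binary flag the model can learn from
print(mode_filled)  # missing rows become indistinguishable from the mode
```

In this sketch, mode imputation merges the missing rows into the most frequent class, so the model can no longer tell an observed value from a guessed one. The extra class plus indicator keeps that information available, which is the argument the first comment makes for D.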