Topic 1 Question 90
You are working on a binary classification ML algorithm that detects whether an image of a classified scanned document contains a company’s logo. In the dataset, 96% of examples don’t have the logo, so the dataset is very skewed. Which metrics would give you the most confidence in your model?
A. F-score where recall is weighed more than precision
B. RMSE
C. F1 score
D. F-score where precision is weighed more than recall
Comments (12)
I think A. If D were the answer, the decision threshold would be raised to increase precision, but with so few positives (4%) recall could then become extremely low. When positives are rare, recall should be given more weight. https://medium.com/@douglaspsteen/beyond-the-f-1-score-a-look-at-the-f-beta-score-3743ac2ef6e3
👍 4 kn29 2022/12/27 - Selected answer: A
In this scenario, the dataset is highly imbalanced, where most of the examples do not have the company's logo. Therefore, accuracy could be misleading as the model can have high accuracy by simply predicting that all images do not have the logo. F1 score is a good metric to consider in such cases, as it takes both precision and recall into account. However, since the dataset is highly skewed, we should weigh recall more than precision to ensure that the model is correctly identifying the images that do have the logo. Therefore, F-score where recall is weighed more than precision is the best metric to evaluate the performance of the model in this scenario. Option B (RMSE) is not applicable to this classification problem, and option D (F-score where precision is weighed more than recall) is not suitable for highly skewed datasets.
👍 4 tavva_prudhvi 2023/03/23
Answer C: F1 score is the best choice for imbalanced data, as in this case: https://stephenallwright.com/imbalanced-data-metric/
👍 3 egdiaa 2022/12/23
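To make the A vs. C vs. D discussion concrete, here is a minimal sketch of how the F-beta score reacts to the choice of beta on a dataset with 4% positives. The confusion-matrix counts and the two "models" (`cautious`, `eager`) are made up for illustration and do not come from the question or the comments; the helper names `f_beta` and `accuracy` are likewise hypothetical.

```python
def f_beta(tp, fp, fn, beta=1.0):
    """F-beta from confusion-matrix counts.

    beta > 1 weighs recall more, beta < 1 weighs precision more,
    and beta == 1 gives the ordinary F1 score.
    """
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    if precision == 0.0 and recall == 0.0:
        return 0.0
    b2 = beta ** 2
    return (1 + b2) * precision * recall / (b2 * precision + recall)


def accuracy(tp, fp, fn, total):
    """Fraction of correct predictions; every example that is not fp/fn is correct."""
    return (total - fp - fn) / total


# Assumed data: 1000 images, 40 of which (4%) contain the logo.
TOTAL = 1000

# Hypothetical models, expressed as confusion-matrix counts:
all_negative = dict(tp=0, fp=0, fn=40)   # always predicts "no logo"
cautious = dict(tp=10, fp=0, fn=30)      # high threshold: precise but misses most logos
eager = dict(tp=36, fp=24, fn=4)         # low threshold: finds most logos, some false alarms

for name, counts in [("all_negative", all_negative),
                     ("cautious", cautious),
                     ("eager", eager)]:
    print(
        f"{name:>12}: "
        f"acc={accuracy(total=TOTAL, **counts):.3f}  "
        f"F0.5={f_beta(beta=0.5, **counts):.3f}  "
        f"F1={f_beta(beta=1.0, **counts):.3f}  "
        f"F2={f_beta(beta=2.0, **counts):.3f}"
    )
```

With these assumed counts, accuracy barely distinguishes the three models (the all-negative baseline already reaches 0.96), the precision-weighted F0.5 rates `cautious` and `eager` almost equally, and the recall-weighted F2 clearly penalizes the model that misses most logos. Which weighting is appropriate still depends on the relative cost of missed logos and false alarms.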