Topic 1 Question 159
A company that manufactures mobile devices wants to determine and calibrate the appropriate sales price for its devices. The company is collecting the relevant data and is determining data features that it can use to train machine learning (ML) models. There are more than 1,000 features, and the company wants to determine the primary features that contribute to the sales price. Which techniques should the company use for feature selection? (Select three.)
A. Data scaling with standardization and normalization
B. Correlation plot with heat maps
C. Data binning
D. Univariate selection
E. Feature importance with a tree-based classifier
F. Data augmentation
Explanation
Reference:
https://towardsdatascience.com/an-overview-of-data-preprocessing-features-enrichment-automatic-feature-selection-60b0c12d75ad
https://towardsdatascience.com/feature-selection-using-python-for-classification-problem-b5f00a1c7028#:~:text=Univariate%20feature%20selection%20works%20by,analysis%20of%20variance%20(ANOVA).&text=That%20is%20why%20it%20is%20called%20'univariate'
https://arxiv.org/abs/2101.04530
Comments (6)
- Selected answer: BDE
I will go for B, D, and E. To me, B and D are like doing a partial regression: a correlation plot can tell you briefly how well each single variable is correlated with your target, and I guess that also applies to D. As for E, a feature importance ranking is exactly what a feature selection strategy wants, from my point of view. Data binning is data enrichment, just like augmentation, but the question says they want to do feature selection over 1,000+ variables, which implies they care more about which variables contribute most to determining the price.
👍 17 ckkobe24 2022/05/01
BDE for me
👍 4 Morsa 2022/07/14
- Selected answer: BDE
B. Correlation plot with heat maps: This technique can be used to identify the relationship between each feature and the target variable (sales price). By creating a correlation plot with heat maps, the company can quickly visualize the strength and direction of the relationship between each feature and the target variable.
D. Univariate selection: This technique can be used to select the features that have the strongest relationship with the target variable. It involves scoring each feature independently against the target with a statistical test (for example, ANOVA) and selecting the highest-scoring ones.
E. Feature importance with a tree-based classifier: This technique can be used to determine the most important features that contribute to the target variable. By using a tree-based classifier such as Random Forest or Gradient Boosting, the company can rank the importance of each feature and select the ones that have the highest importance.
👍 4 AjoseO 2023/02/17
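For anyone who wants to try B, D, and E, here is a minimal sketch in Python with NumPy and scikit-learn. Everything in it is an assumption for illustration, not part of the question: the data is synthetic (`make_regression` standing in for the device data), `K` is a hypothetical number of features to keep, and because the sales price is a continuous target the sketch swaps in a tree-based regressor (`RandomForestRegressor`) where the option text says "classifier". No heat map is drawn; the `corr` values are what a plotting library (for example seaborn's `heatmap`) would display.

```python
import numpy as np
from sklearn.datasets import make_regression
from sklearn.ensemble import RandomForestRegressor
from sklearn.feature_selection import SelectKBest, f_regression

# Synthetic stand-in for the device data (assumption): 200 rows, 30 features.
# With shuffle=False, only the first 5 columns actually drive the target.
X, y = make_regression(
    n_samples=200, n_features=30, n_informative=5,
    noise=5.0, shuffle=False, random_state=0,
)
K = 5  # hypothetical number of features to keep

# B. Correlation: Pearson correlation of each feature with the target,
#    ranked by absolute value.
corr = np.array([np.corrcoef(X[:, j], y)[0, 1] for j in range(X.shape[1])])
top_corr = np.argsort(-np.abs(corr))[:K]

# D. Univariate selection: score each feature on its own with an F-test
#    and keep the K highest-scoring features.
selector = SelectKBest(score_func=f_regression, k=K).fit(X, y)
top_univariate = np.flatnonzero(selector.get_support())

# E. Tree-based importance: fit a forest on all features at once and rank
#    by impurity-based importance.
forest = RandomForestRegressor(n_estimators=100, random_state=0).fit(X, y)
top_tree = np.argsort(-forest.feature_importances_)[:K]

print("top by |correlation|:   ", sorted(top_corr.tolist()))
print("top by univariate F-test:", sorted(top_univariate.tolist()))
print("top by forest importance:", sorted(top_tree.tolist()))
```

Note that for a continuous target the univariate F-score is a monotone function of the squared Pearson correlation, so B and D pick the same features in this sketch. E can differ from both, since the forest considers all features together and can pick up non-linear effects.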