Topic 1 Question 152
A machine learning (ML) specialist must develop a classification model for a financial services company. A domain expert provides the dataset, which is tabular with 10,000 rows and 1,020 features. During exploratory data analysis, the specialist finds no missing values and a small percentage of duplicate rows. There are correlation scores of > 0.9 for 200 feature pairs. The mean value of each feature is similar to its 50th percentile. Which feature engineering strategy should the ML specialist use with Amazon SageMaker?
A. Apply dimensionality reduction by using the principal component analysis (PCA) algorithm.
B. Drop the features with low correlation scores by using a Jupyter notebook.
C. Apply anomaly detection by using the Random Cut Forest (RCF) algorithm.
D. Concatenate the features with high correlation scores by using a Jupyter notebook.
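The two EDA findings in the question (feature pairs with correlation above 0.9, and means close to the 50th percentile) can be reproduced with a few lines of code. The following is only a minimal sketch using the Python standard library on hypothetical toy data; the helper names `pearson` and `highly_correlated_pairs` and the four synthetic features are assumptions for illustration, not part of the original dataset or of any SageMaker API.

```python
import random
import statistics
from itertools import combinations


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    mx, my = statistics.fmean(x), statistics.fmean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    var_x = sum((a - mx) ** 2 for a in x)
    var_y = sum((b - my) ** 2 for b in y)
    return cov / (var_x * var_y) ** 0.5


def highly_correlated_pairs(columns, threshold=0.9):
    """Return index pairs (i, j) whose absolute correlation exceeds threshold."""
    return [
        (i, j)
        for i, j in combinations(range(len(columns)), 2)
        if abs(pearson(columns[i], columns[j])) > threshold
    ]


# Hypothetical toy data: 4 features, 500 rows.
# Feature 1 is a noisy copy of feature 0, so that pair is redundant.
rng = random.Random(0)
f0 = [rng.gauss(0, 1) for _ in range(500)]
f1 = [v + rng.gauss(0, 0.1) for v in f0]
f2 = [rng.gauss(0, 1) for _ in range(500)]
f3 = [rng.gauss(5, 2) for _ in range(500)]
columns = [f0, f1, f2, f3]

pairs = highly_correlated_pairs(columns)
# Gap between mean and median per feature; a small gap suggests little skew.
mean_median_gap = [abs(statistics.fmean(c) - statistics.median(c)) for c in columns]

print(pairs)
print(mean_median_gap)
```

On this toy data only the pair (0, 1) should be flagged. Many such pairs in a real dataset indicate redundant features, which is the situation dimensionality reduction addresses.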
Comments (7)
- Selected answer: A
Dimensionality is too high. Use PCA.
👍 8 ovokpus 2022/06/25
A should be the answer, to avoid the curse of dimensionality.
👍 6 LydiaGom 2022/05/09
- Selected answer: A
I think it's A.
👍 4 DJiang 2022/05/08
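To illustrate why commenters favor PCA when many features are highly correlated, here is a rough sketch of PCA computed with NumPy's SVD. It is a stand-in for illustration only, not the SageMaker built-in PCA algorithm; the function name `pca_reduce`, the 95% variance threshold, and the synthetic data (12 observed features generated from 3 latent factors) are all assumptions.

```python
import numpy as np


def pca_reduce(X, variance_to_keep=0.95):
    """Project X onto the fewest principal components that retain
    at least `variance_to_keep` of the total variance."""
    X_centered = X - X.mean(axis=0)
    # SVD of the centered data: the rows of Vt are the principal directions.
    _, singular_values, Vt = np.linalg.svd(X_centered, full_matrices=False)
    explained = singular_values**2 / np.sum(singular_values**2)
    # First position where the cumulative explained variance reaches the target.
    k = int(np.searchsorted(np.cumsum(explained), variance_to_keep)) + 1
    k = min(k, len(explained))
    return X_centered @ Vt[:k].T, explained[:k]


# Hypothetical data: 12 correlated features driven by only 3 latent factors.
rng = np.random.default_rng(0)
latent = rng.normal(size=(200, 3))
mixing = rng.normal(size=(3, 12))
X = latent @ mixing + 0.01 * rng.normal(size=(200, 12))

Z, explained = pca_reduce(X)
print(X.shape, Z.shape)
print(explained.sum())
```

Because the 12 features are nearly linear combinations of 3 factors, at most 3 components are needed here to retain 95% of the variance. PCA is sensitive to feature scale, so standardizing features first is usually advisable on real data.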