Examtopics

AWS Certified Machine Learning - Specialty
  • Topic 1 Question 352

    A data scientist is conducting exploratory data analysis (EDA) on a dataset that contains information about product suppliers. The dataset records the country where each product supplier is located as a two-letter text code. For example, the code for New Zealand is "NZ."

    The data scientist needs to transform the country codes for model training. The data scientist must choose the solution that will result in the smallest increase in dimensionality. The solution must not result in any information loss.

    Which solution will meet these requirements?

    • Add a new column of data that includes the full country name.

    • Encode the country codes into numeric variables by using similarity encoding.

    • Map the country codes to continent names.

    • Encode the country codes into numeric variables by using one-hot encoding.


    シャッフルモード