Examtopics

Professional Data Engineer
  • Topic 1 Question 27

    You are building a model to predict whether it will rain on a given day. You have thousands of input features and want to see whether you can improve training speed by removing some features with minimal effect on model accuracy. What can you do?

    • Eliminate features that are highly correlated to the output labels.

    • Combine highly co-dependent features into one representative feature.

    • Instead of feeding in each feature individually, average their values in batches of 3.

    • Remove the features that have null values for more than 50% of the training records.
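
To make the option about combining highly co-dependent features concrete, here is a minimal sketch in plain Python. It is an illustration under stated assumptions, not a prescribed method: the function names (`pearson`, `group_correlated`, `combine`), the sample feature names, and the 0.95 threshold are all hypothetical, and "combining" is approximated here by keeping the first member of each correlated group as its representative.

```python
import math


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = math.sqrt(sum((x - mx) ** 2 for x in xs))
    sy = math.sqrt(sum((y - my) ** 2 for y in ys))
    if sx == 0 or sy == 0:
        return 0.0  # constant column: treat as uncorrelated
    return cov / (sx * sy)


def group_correlated(features, threshold=0.95):
    """Greedily group feature names by correlation with each group's first member.

    `threshold` is an assumed cutoff on the absolute correlation.
    """
    groups = []
    for name, values in features.items():
        for group in groups:
            if abs(pearson(features[group[0]], values)) >= threshold:
                group.append(name)
                break
        else:
            groups.append([name])  # no sufficiently similar group found
    return groups


def combine(features, threshold=0.95):
    """Keep one representative feature (the first member) per group."""
    groups = group_correlated(features, threshold)
    return {group[0]: features[group[0]] for group in groups}


# Hypothetical weather features: temp_f is a linear function of temp_c.
features = {
    "temp_c": [10, 15, 20, 25, 30],
    "temp_f": [50, 59, 68, 77, 86],
    "humidity": [80, 60, 75, 50, 65],
}
reduced = combine(features)
print(list(reduced))
```

In this toy data the two temperature columns carry the same information, so only one of them is kept, while `humidity` survives as its own feature. With thousands of real features, a library routine for correlation matrices or a dimensionality-reduction method would likely be more practical than this pairwise loop.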

