Examtopics

Professional Machine Learning Engineer


Topic 1 Question 131
You are an ML engineer at a mobile gaming company. A data scientist on your team recently trained a TensorFlow model, and you are responsible for deploying this model into a mobile application. You discover that the inference latency of the current model doesn’t meet production requirements. You need to reduce the inference time by 50%, and you are willing to accept a small decrease in model accuracy in order to reach the latency requirement. Without training a new model, which model optimization technique for reducing latency should you try first?
- A. Weight pruning
- B. Dynamic range quantization
- C. Model distillation
- D. Dimensionality reduction
User votes
Comments (7)
- Selected answer: B
  B
  
  https://www.tensorflow.org/lite/performance/post_training_quantization#dynamic_range_quantization
  👍 3
  hiromi 2022/12/21
- Selected answer: B
  'Without training a new model' --> B
  
  👍 3
  ares81 2023/01/03
- Selected answer: B
  The requirement is "Without training a new model", hence dynamic range quantization. https://www.tensorflow.org/lite/performance/post_training_quant
  
  👍 2
  mil_spyro 2022/12/13
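
For readers following the thread, here is a minimal sketch of the idea behind answer B. Dynamic range quantization stores the weights of an already trained model as 8-bit integers plus a floating-point scale, which is why no retraining is needed. The pure-Python code below illustrates symmetric per-tensor int8 quantization on a made-up weight list. It is a simplified illustration, not TensorFlow Lite's implementation, and the function names `quantize_weights` and `dequantize_weights` are hypothetical. In TensorFlow Lite itself, this is typically requested by setting `converter.optimizations = [tf.lite.Optimize.DEFAULT]` on a `tf.lite.TFLiteConverter` before calling `convert()`.

```python
# Illustrative only: symmetric per-tensor int8 quantization of a weight list.
# This is NOT TensorFlow Lite's implementation; function names are hypothetical.

def quantize_weights(weights):
    """Map float weights to integers in [-127, 127] plus one float scale."""
    max_abs = max(abs(w) for w in weights)
    # Guard against an all-zero tensor to avoid dividing by zero.
    scale = max_abs / 127.0 if max_abs > 0 else 1.0
    quantized = [max(-127, min(127, round(w / scale))) for w in weights]
    return quantized, scale


def dequantize_weights(quantized, scale):
    """Recover approximate float weights from the integers and the scale."""
    return [q * scale for q in quantized]


# Made-up example weights (an assumption, not taken from any real model).
weights = [0.5, -1.27, 0.003, 1.0]
quantized, scale = quantize_weights(weights)
restored = dequantize_weights(quantized, scale)

# Worst-case rounding error introduced by storing weights as int8.
max_error = max(abs(w - r) for w, r in zip(weights, restored))
print(quantized, scale, max_error)
```

In this sketch each weight is off by at most half the scale after the round trip. Rounding error of this kind is one source of the small accuracy decrease the question says is acceptable.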