Topic 1 Question 131
You are an ML engineer at a mobile gaming company. A data scientist on your team recently trained a TensorFlow model, and you are responsible for deploying this model into a mobile application. You discover that the inference latency of the current model doesn’t meet production requirements. You need to reduce the inference time by 50%, and you are willing to accept a small decrease in model accuracy in order to reach the latency requirement. Without training a new model, which model optimization technique for reducing latency should you try first?
A. Weight pruning
B. Dynamic range quantization
C. Model distillation
D. Dimensionality reduction
User votes
Comments (7)
- hiromi (2022/12/21) — Selected answer: B
  👍 3
- ares81 (2023/01/03) — Selected answer: B
  'Without training a new model' --> B
  👍 3
- mil_spyro (2022/12/13) — Selected answer: B
  The requirement is "Without training a new model", hence dynamic range quantization. https://www.tensorflow.org/lite/performance/post_training_quant
  👍 2
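The commenters' reasoning maps to TensorFlow Lite post-training dynamic range quantization, which needs neither retraining nor a calibration dataset. A minimal sketch is below; the tiny Keras model is a stand-in for the data scientist's trained model, which in practice you would load from its SavedModel or Keras file instead.

```python
import tensorflow as tf

# Stand-in for the already-trained model (assumption for illustration;
# in production you would load the data scientist's model instead).
model = tf.keras.Sequential([
    tf.keras.Input(shape=(8,)),
    tf.keras.layers.Dense(16, activation="relu"),
    tf.keras.layers.Dense(1),
])

converter = tf.lite.TFLiteConverter.from_keras_model(model)
# Optimize.DEFAULT triggers dynamic range quantization: weights are
# stored as 8-bit integers, activations stay in float at runtime.
# No retraining and no representative dataset are required.
converter.optimizations = [tf.lite.Optimize.DEFAULT]
tflite_model = converter.convert()

# Ship the quantized flatbuffer with the mobile app.
with open("model_quant.tflite", "wb") as f:
    f.write(tflite_model)
```

The resulting `.tflite` file is typically about 4x smaller than the float model and runs faster on mobile CPUs, at the cost of a small accuracy drop, which matches the question's constraints.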