Topic 1 Question 218
You have built a custom model that performs several memory-intensive preprocessing tasks before it makes a prediction. You deployed the model to a Vertex AI endpoint, and validated that results were received in a reasonable amount of time. After routing user traffic to the endpoint, you discover that the endpoint does not autoscale as expected when receiving multiple requests. What should you do?
A. Use a machine type with more memory.
B. Decrease the number of workers per machine.
C. Increase the CPU utilization target in the autoscaling configurations.
D. Decrease the CPU utilization target in the autoscaling configurations.
User votes
Comments (1)
- Selected answer: A
B. Decreasing Workers: This might reduce memory usage per machine but could also decrease overall throughput, potentially impacting performance.
C. Increasing CPU Utilization Target: This wouldn't directly address the memory bottleneck and could trigger unnecessary scaling based on CPU usage, not memory requirements.
D. Decreasing CPU Utilization Target: This could lead to premature scaling, potentially increasing costs without addressing the root cause.
👍 1 · pikachu007 · 2024/01/12
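For readers weighing the two CPU-target options in the question, here is a minimal sketch of how a utilization target can translate into a replica count. It assumes the proportional rule that, as far as I recall, the Vertex AI autoscaling documentation describes (target replicas = ceil(current replicas × current utilization / target utilization)). The function name `target_replicas`, its parameters, and the sample numbers are hypothetical and chosen only for illustration; this is not Vertex AI code and it does not call any Vertex AI API.

```python
import math


def target_replicas(current_replicas, cpu_utilization, cpu_target,
                    min_replicas, max_replicas):
    """Hypothetical helper: replica count under a proportional autoscaling rule.

    Assumes target = ceil(current * utilization / target_utilization),
    clamped to the [min_replicas, max_replicas] range.
    """
    raw = math.ceil(current_replicas * cpu_utilization / cpu_target)
    return max(min_replicas, min(max_replicas, raw))


# Illustrative numbers only: 2 replicas observed at 70% CPU, limits 1..10.
for target in (80, 60, 30):
    replicas = target_replicas(2, 70, target, min_replicas=1, max_replicas=10)
    print(f"CPU target {target}% -> {replicas} replicas")
```

With these sample inputs, the 80% target keeps the endpoint at 2 replicas, the 60% target asks for 3, and the 30% target asks for 5. Under this rule, raising the target makes scale-out happen later and lowering it makes scale-out happen earlier. Memory usage does not appear in the formula at all, which is why a memory-bound workload can stay below the CPU target even when it is under load.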