Topic 1 Question 336
A media company wants to deploy a machine learning (ML) model on Amazon SageMaker to recommend new articles to the company's readers. The company's readers are primarily located in a single city.
The company notices that the heaviest reader traffic predictably occurs early in the morning, after lunch, and again after work hours. There is very little traffic at other times of day. The media company needs to minimize the time required to deliver recommendations to its readers. The expected amount of data that the API call will return for inference is less than 4 MB.
Which solution will meet these requirements in the MOST cost-effective way?
A. Real-time inference with auto scaling
B. Serverless inference with provisioned concurrency
C. Asynchronous inference
D. A batch transform task
User votes
Comments (5)
- Selected answer: B
On-demand Serverless Inference is ideal for workloads that have idle periods between traffic spurts. Optionally, you can also use Provisioned Concurrency with Serverless Inference. Serverless Inference with provisioned concurrency is a cost-effective option when you have predictable bursts in your traffic. https://docs.aws.amazon.com/sagemaker/latest/dg/serverless-endpoints.html
👍 2 · Tkhan1 · 2024/09/18
- Selected answer: B
By choosing serverless inference with provisioned concurrency, the media company can get low latency during the predictable peak periods while optimizing costs by paying only for the actual inference requests.
👍 1 · GS_77 · 2024/09/07
- Selected answer: A
The traffic pattern is predictable. Provisioned resources have minimal cost.
👍 1 · luccabastos · 2024/09/13
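For context on the top-voted answer: a serverless endpoint with provisioned concurrency is defined through SageMaker's `CreateEndpointConfig` API, where `ServerlessConfig` carries a `ProvisionedConcurrency` field alongside `MemorySizeInMB` and `MaxConcurrency`. Below is a minimal sketch of such a request payload; the endpoint-config and model names are placeholders, and the memory/concurrency values are illustrative, not a recommendation.

```python
# Sketch of a CreateEndpointConfig request for a serverless endpoint with
# provisioned concurrency. Names and sizing values are hypothetical.
endpoint_config = {
    "EndpointConfigName": "article-recs-serverless",  # placeholder name
    "ProductionVariants": [
        {
            "VariantName": "AllTraffic",
            "ModelName": "article-recs-model",  # placeholder model name
            "ServerlessConfig": {
                "MemorySizeInMB": 4096,   # 1024-6144 MB, in 1 GB increments
                "MaxConcurrency": 20,     # cap on concurrent invocations
                # Keeps warm capacity ready for the predictable morning,
                # lunch, and evening peaks, avoiding cold-start latency.
                "ProvisionedConcurrency": 10,
            },
        }
    ],
}

# To actually create it (requires AWS credentials and permissions):
#   import boto3
#   boto3.client("sagemaker").create_endpoint_config(**endpoint_config)
print(endpoint_config["ProductionVariants"][0]["ServerlessConfig"])
```

Outside the peak windows, provisioned concurrency can be scaled down (or scheduled via Application Auto Scaling) so the company pays little during the idle periods, which is what makes option B cost-effective for this traffic pattern.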