Topic 1 Question 10
A company wants to use language models to create an application for inference on edge devices. The inference must have the lowest latency possible. Which solution will meet these requirements?
A. Deploy optimized small language models (SLMs) on edge devices.
B. Deploy optimized large language models (LLMs) on edge devices.
C. Incorporate a centralized small language model (SLM) API for asynchronous communication with edge devices.
D. Incorporate a centralized large language model (LLM) API for asynchronous communication with edge devices.
Comments (8)
Using optimized small language models (SLMs) on edge devices is the best choice because they are designed to run efficiently within the resource constraints of edge hardware. This minimizes latency and delivers fast inference while using less compute and memory. The problem with the centralized API options is the network latency they add to every request.
👍 4 · galliaj · 2024/11/01 · Selected answer: A
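As a rough illustration of the point above, here is a minimal sketch of running a quantized SLM entirely on an edge device using the llama-cpp-python bindings. The model file name is a placeholder, and the parameter values are illustrative assumptions; any GGUF-format small model would work the same way. The key property is that no network round trip is involved, so latency is bounded by local compute only.

```python
# Minimal sketch: on-device inference with a quantized small language model.
# Assumes a GGUF model file is already present on the device (path is a placeholder).
from llama_cpp import Llama

# Load the quantized SLM locally; no calls to a centralized API are made.
llm = Llama(model_path="slm-q4.gguf", n_ctx=512, n_threads=4)

# Run a single inference request directly on the edge device.
result = llm(
    "Summarize the sensor reading: temperature 41C, vibration high.",
    max_tokens=64,
)
print(result["choices"][0]["text"])
```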
A: Deploy optimized small language models (SLMs) on edge devices.
Explanation: Deploying optimized small language models (SLMs) on edge devices ensures low latency because the inference happens directly on the device without relying on cloud communication. Small language models are lightweight and designed to run efficiently on devices with limited resources, making them ideal for edge computing.
👍 4 · Moon · 2024/12/30 · Selected answer: A
An SLM on edge devices is the correct solution.
👍 2 · tccusa · 2024/11/03