Topic 1 Question 79
A company has built a solution by using generative AI. The solution uses large language models (LLMs) to translate training manuals from English into other languages. The company wants to evaluate the accuracy of the solution by examining the text generated for the manuals. Which model evaluation strategy meets these requirements?
A. Bilingual Evaluation Understudy (BLEU)
B. Root mean squared error (RMSE)
C. Recall-Oriented Understudy for Gisting Evaluation (ROUGE)
D. F1 score
User votes
Comments (6)
Selected answer: A
BLEU (Bilingual Evaluation Understudy) is an algorithm for evaluating the quality of text that has been machine-translated from one natural language to another.
👍 2 · Amitst · 2024/12/05

Selected answer: A
BLEU is specifically designed to measure the quality of machine translations by comparing them to human-created reference translations.
👍 2 · Dandelion2025 · 2024/12/07

Selected answer: C
C. Recall-Oriented Understudy for Gisting Evaluation (ROUGE)
ROUGE is a popular metric for evaluating the quality of text summarization and machine translation systems. It focuses on recall, measuring how well the generated text covers the relevant information from the reference text. In this case, ROUGE can be used to assess how accurately the LLM-generated translations capture the meaning and content of the original English manuals.
👍 1 · aws4myself · 2024/12/05
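To make the two metrics debated in the comments concrete, here is a minimal pure-Python sketch of both: sentence-level BLEU (geometric mean of clipped n-gram precisions with a brevity penalty) and ROUGE-1 recall (fraction of reference unigrams covered by the candidate). This is an illustrative simplification, not any official implementation; function names and the whitespace tokenization are assumptions for the example.

```python
import math
from collections import Counter


def ngrams(tokens, n):
    """All contiguous n-grams of a token list."""
    return [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]


def bleu(candidate, reference, max_n=4):
    """Sentence-level BLEU: geometric mean of clipped n-gram
    precisions for n = 1..max_n, times a brevity penalty."""
    cand, ref = candidate.split(), reference.split()
    precisions = []
    for n in range(1, max_n + 1):
        cand_counts = Counter(ngrams(cand, n))
        ref_counts = Counter(ngrams(ref, n))
        # Clip each candidate n-gram's count by its count in the reference.
        overlap = sum(min(c, ref_counts[g]) for g, c in cand_counts.items())
        precisions.append(overlap / max(sum(cand_counts.values()), 1))
    if min(precisions) == 0:
        return 0.0
    geo_mean = math.exp(sum(math.log(p) for p in precisions) / max_n)
    # Brevity penalty: penalize candidates shorter than the reference.
    bp = 1.0 if len(cand) >= len(ref) else math.exp(1 - len(ref) / max(len(cand), 1))
    return bp * geo_mean


def rouge_1_recall(candidate, reference):
    """ROUGE-1 recall: clipped unigram overlap divided by the
    number of unigrams in the reference."""
    cand_counts = Counter(candidate.split())
    ref_counts = Counter(reference.split())
    overlap = sum(min(c, cand_counts[w]) for w, c in ref_counts.items())
    return overlap / max(sum(ref_counts.values()), 1)
```

A perfect translation scores 1.0 on both metrics, and BLEU drops to 0 as soon as any n-gram order has no overlap, which is why BLEU is the standard choice for precision-oriented translation evaluation while ROUGE is more commonly reported for summarization.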