Topic 1 Question 47
A social media company wants to use a large language model (LLM) for content moderation. The company wants to evaluate the LLM outputs for bias and potential discrimination against specific groups or individuals. Which data source should the company use to evaluate the LLM outputs with the LEAST administrative effort?
A. User-generated content
B. Moderation logs
C. Content moderation guidelines
D. Benchmark datasets
Community vote
Comments (3)
- Selected answer: D
Benchmark datasets are specifically designed to test the performance of language models on various tasks, including bias detection. They often contain diverse data that can help identify potential biases in the LLM's outputs.
👍 2 · jove · 2024/11/05
- Selected answer: D
Least administrative effort: Benchmark datasets are pre-existing, curated collections of data specifically designed for evaluating AI models, including LLMs. Using these requires the least administrative effort compared to the other options.
👍 1 · Blair77 · 2024/11/12
- Selected answer: D
Benchmark datasets are specifically designed for evaluating models on tasks such as fairness and bias. They typically include a wide range of content and scenarios for assessing how a model handles various forms of bias or discrimination. Because they are pre-structured and widely recognized for evaluating model behavior across many contexts, using them requires the least administrative effort.
👍 1 · Jessiii · 2025/02/11
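The evaluation the commenters describe, running a curated benchmark through the moderation model and comparing outcomes across groups, can be sketched as below. This is a minimal illustration: the benchmark records, the `moderate` stand-in, and the simple parity-gap metric are all hypothetical assumptions, not a specific product or dataset.

```python
from collections import defaultdict

# Hypothetical benchmark: items paired with a demographic group label.
# A real benchmark dataset would be a curated, pre-labeled collection.
benchmark = [
    {"group": "group_a", "text": "example post 1"},
    {"group": "group_a", "text": "example post 2"},
    {"group": "group_b", "text": "example post 3"},
    {"group": "group_b", "text": "example post 4"},
]

def moderate(text):
    """Stand-in for the LLM moderation call; returns True if flagged.
    Placeholder rule chosen only so the example has an observable gap."""
    return "3" in text or "4" in text

def flag_rates_by_group(items, classify):
    """Fraction of items flagged per demographic group."""
    flagged = defaultdict(int)
    total = defaultdict(int)
    for item in items:
        total[item["group"]] += 1
        if classify(item["text"]):
            flagged[item["group"]] += 1
    return {g: flagged[g] / total[g] for g in total}

rates = flag_rates_by_group(benchmark, moderate)
# Demographic parity gap: max difference in flag rates across groups.
# A large gap suggests the model treats some groups differently.
gap = max(rates.values()) - min(rates.values())
print(rates, gap)
```

Because the benchmark is pre-labeled, this kind of disparity check runs with no manual annotation, which is what makes option D the lowest-effort choice.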