Resource summary
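English title: Can Large Language Models Be an Alternative to Human Evaluations?
Chinese abstract: This paper examines the potential of using large language models (LLMs) in place of human evaluation to assess AI-generated text. It explores LLM evaluation results on two natural language processing tasks, open-ended story generation and adversarial attacks, and finds that the LLM evaluation results are consistent with those of human experts.
English abstract: Human evaluation is indispensable and inevitable for assessing the quality of texts generated by machine learning models or written by humans. However, human evaluation is very difficult to reproduce and its quality is notoriously unstable, hindering fair comparisons among different natural language processing (NLP)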