File list:
视觉指令调整【英文版】.pdf (Visual Instruction Tuning, English version)
Resource description
English title: Visual Instruction Tuning
Summary (translated from Chinese): This paper uses the language model GPT-4 to generate multimodal image-text instruction sequences for tuning a multimodal model, yielding a new model, LLaVA, which performs well on multiple datasets.
English abstract: Instruction tuning large language models (LLMs) using machine-generated instruction-following data has improved zero-shot capabilities on new tasks, but the idea is less explored in the multimodal field. In this paper, we present the first attempt to use language-only GPT-4 to generate multimodal language-image instruction-following data. By instruction tuning on such generated data, we introduce LLa