File list:
Verbs in Action: Improving verb understanding in video-language models [English version].pdf
Resource summary
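English title: Verbs in Action: Improving verb understanding in video-language models
Chinese abstract (translated): This work proposes a new Verb-Focused Contrastive (VFC) framework to improve the verb understanding of CLIP-based video-language models. The method uses pretrained large language models (LLMs) to create hard negatives for cross-modal contrastive learning, and applies a fine-grained verb phrase alignment loss. It achieves state-of-the-art zero-shot performance on three downstream tasks: video-text matching, video question answering, and video classification.
English abstract: Understanding verbs is crucial to modelling how people and objects interact with each other and the environment through space and time. Recently, state-of-the-art video-language models based on CLIP have been shown to have limited verb understa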
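To make the hard-negative contrastive idea in the summary concrete, here is a minimal sketch in plain Python. It is not the paper's implementation: the function name, the toy two-dimensional embeddings, and the temperature value are all assumptions made for illustration, in-batch negatives are left out, and the verb phrase alignment loss is not covered. It only shows how a caption with a swapped verb, once embedded, competes with the true caption in an InfoNCE-style softmax.

```python
import math


def dot(u, v):
    """Dot product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))


def normalize(v):
    """Scale a vector to unit length so dot products become cosine similarities."""
    norm = math.sqrt(dot(v, v))
    return [a / norm for a in v]


def contrastive_loss_with_hard_negatives(video, caption, hard_negatives,
                                         temperature=0.07):
    """InfoNCE-style loss for one video embedding (hypothetical helper).

    The matching caption is the positive (index 0); every hard-negative
    caption embedding is an extra candidate in the softmax.
    The temperature default is an assumed value, not taken from the source.
    """
    v = normalize(video)
    candidates = [normalize(caption)] + [normalize(h) for h in hard_negatives]
    logits = [dot(v, c) / temperature for c in candidates]
    # Numerically stable log-sum-exp over all candidates.
    m = max(logits)
    log_sum_exp = m + math.log(sum(math.exp(l - m) for l in logits))
    # Negative log-probability of the positive caption.
    return log_sum_exp - logits[0]


# Toy embeddings (assumed values): the "near" negative stands in for a
# caption whose verb was swapped, so it sits close to the true caption.
video = [1.0, 0.0]
caption = [1.0, 0.0]
near_negative = [0.9, 0.1]
far_negative = [0.0, 1.0]

loss_near = contrastive_loss_with_hard_negatives(video, caption, [near_negative])
loss_far = contrastive_loss_with_hard_negatives(video, caption, [far_negative])
```

In this toy setup the near negative yields a larger loss than the far one, which is the reason hard negatives give a stronger training signal than easy ones.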