
Ferret: Refer and Ground Anything Anywhere at Any Granularity (paper, English version)

Published by: wx****da
2024-01-04
26 MB, 30 pages
Technology
File list:
雪貂:在任何粒度的任何地方引用和研磨任何东西论文(英文版).pdf

Ferret adds referring and grounding capabilities to a multimodal large language model (MLLM). For referring, a user can indicate a region or an object with a point, a box, or any free-form shape; each regionN placeholder in the input is replaced by the proposed hybrid representation before being fed into the LLM. For grounding, Ferret is able to accurately ground any open-vocabulary description; each boxN in the output denotes the predicted bounding box coordinates.
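
The placeholder convention described above (regionN in the input, boxN in the output) can be illustrated with a small text-substitution sketch. This is only an assumption-laden illustration: the helper name `fill_placeholders`, the bracketed coordinate format, and the sample coordinates are hypothetical, and the hybrid representation Ferret actually uses is not reproduced here.

```python
import re


def fill_placeholders(text, prefix, values):
    """Replace tokens such as region0 or box1 with bracketed coordinates.

    Hypothetical helper: `values[N]` holds the coordinates for placeholder N.
    The plain-text coordinate format is an illustrative stand-in only.
    """
    pattern = re.compile(rf"\b{re.escape(prefix)}(\d+)\b")

    def substitute(match):
        # The digits after the prefix select which entry to use.
        coords = values[int(match.group(1))]
        return "[" + ", ".join(str(c) for c in coords) + "]"

    return pattern.sub(substitute, text)


# Referring: the user's region is substituted into the input prompt.
prompt = "What is the animal in region0 doing?"
regions = [(120, 45, 310, 260)]  # made-up box coordinates
print(fill_placeholders(prompt, "region", regions))

# Grounding: box placeholders in the output are bound to predicted boxes.
answer = "A dog box0 and a cat box1."
boxes = [(1, 2, 3, 4), (5, 6, 7, 8)]  # made-up predictions
print(fill_placeholders(answer, "box", boxes))
```

The same function serves both directions because only the placeholder prefix differs; a real system would substitute model inputs rather than plain text for the referring case.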

