ByteDance: 2025 Technical Report on the Seed1.5-VL Vision-Language Multimodal Large Model (English Edition)

Publisher: wx****57

2025-08-21

31 MB, 77 pages

Internet

File list:

字节跳动：2025年视觉-语言多模态大模型Seed1.5-VL技术报告（英文版）.pdf

Resource summary

We present Seed1.5-VL, a vision-language foundation model designed to advance general-purpose multimodal understanding and reasoning. Seed1.5-VL is composed of a 532M-parameter vision encoder and a Mixture-of-Experts (MoE) LLM with 20B active parameters. Despite its relatively compact architecture, it delivers strong performance across a wide spectrum of public VLM benchmarks and internal evaluation suites, achieving state-of-the-art performance on 38 out of 60 public benchmarks. Moreover
