字节跳动:2025年Seedream4.0技术报告:迈向下一代多模态图像生成(英文版).pdf |
下载文档 |
资源简介
We introduce Seedream 4.0, an efficient and high-performance multimodal image generation system that unifies text-to-image (T2I) synthesis, image editing, and multi-image composition within a single framework. We develop a highly efficient diffusion transformer with a powerful VAE which also can reduce the number of image tokens considerably. This allows for efficient training of our model, and enables it to fast generate native high-resolution images (e.g., 1K-4K). Seedream 4.0 is pretrained
已阅读到文档的结尾了