Xiaohongshu: dots.llm1 Technical Report (English Version)
Resource Overview
Mixture of Experts (MoE) models have emerged as a promising paradigm for scaling language models efficiently by activating only a subset of parameters for each input token. In this report, we present dots.llm1, a large-scale MoE model that activates 14 billion parameters out of a total of 142 billion, delivering performance on par with state-of-the-art models while reducing training and inference costs. Leveraging our meticulously crafted and efficient data processing pipeline, dots.llm1 …
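To make the "activate only a subset of parameters per token" idea concrete, the sketch below shows generic top-k expert routing as used in MoE transformer layers. It is a minimal illustration, not the dots.llm1 implementation: the class name `TopKMoE`, the hyperparameters (hidden size, number of experts, top-k), and the expert FFN shape are placeholder assumptions.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F


class TopKMoE(nn.Module):
    """Toy top-k MoE feed-forward layer: each token is routed to `top_k` experts."""

    def __init__(self, d_model: int = 1024, d_ff: int = 4096,
                 n_experts: int = 8, top_k: int = 2):
        super().__init__()
        self.top_k = top_k
        self.router = nn.Linear(d_model, n_experts, bias=False)  # token -> expert scores
        self.experts = nn.ModuleList(
            nn.Sequential(nn.Linear(d_model, d_ff), nn.GELU(), nn.Linear(d_ff, d_model))
            for _ in range(n_experts)
        )

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # x: (num_tokens, d_model)
        gate = F.softmax(self.router(x), dim=-1)                 # (num_tokens, n_experts)
        weights, idx = gate.topk(self.top_k, dim=-1)             # keep only the top-k experts
        weights = weights / weights.sum(dim=-1, keepdim=True)    # renormalize gate weights
        out = torch.zeros_like(x)
        for e, expert in enumerate(self.experts):
            token_ids, slot = (idx == e).nonzero(as_tuple=True)  # tokens that chose expert e
            if token_ids.numel():
                out[token_ids] += weights[token_ids, slot].unsqueeze(-1) * expert(x[token_ids])
        return out


if __name__ == "__main__":
    layer = TopKMoE()
    tokens = torch.randn(16, 1024)
    print(layer(tokens).shape)  # torch.Size([16, 1024])
```

With routing of this kind, per-token compute scales with the top-k experts rather than the full expert pool, which is how a model can hold 142B parameters overall while activating only 14B (roughly 10%) for each token.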