
On the Expressivity Role of LayerNorm in Transformers' Attention (English version)

Uploaded by: wx****63
2023-05-05
444 KB, 9 pages
Artificial Intelligence (AI)
File list:
关于 LayerNorm 在 Transformer 注意力机制中表现力的作用【英文版】.pdf
English title: On the Expressivity Role of LayerNorm in Transformers' Attention

Chinese abstract (translated): This paper shows that LayerNorm is an important component of the expressivity of the multi-head attention layer in Transformer models; its two steps, projection and scaling, are both essential to the attention mechanism.

English abstract: Layer Normalization (LayerNorm) is an inherent component in all Transformer-based models. In this paper, we show that LayerNorm is crucial to the expressivity of the multi-head attention layer that follows it. This is in contrast to the common belief that LayerNorm's only role is to normalize the activations during the forward pass,
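The "projection and scaling" decomposition mentioned in the abstract can be sketched in plain Python. This is an illustrative sketch, not code from the paper: it implements unparameterized LayerNorm (no learned gain or bias) and checks the two geometric properties the decomposition implies, namely that subtracting the mean projects the input onto the hyperplane orthogonal to the all-ones vector, and that dividing by the standard deviation places it on a sphere of radius sqrt(d).

```python
import math

def layernorm_decomposed(x, eps=1e-5):
    """Unparameterized LayerNorm split into its two steps (illustrative sketch)."""
    d = len(x)
    mean = sum(x) / d
    # Step 1: projection -- subtracting the mean projects x onto the
    # hyperplane orthogonal to the all-ones vector [1, 1, ..., 1].
    projected = [v - mean for v in x]
    # Step 2: scaling -- dividing by the standard deviation places the
    # projected vector on a sphere of radius sqrt(d).
    var = sum(v * v for v in projected) / d
    scale = math.sqrt(var + eps)
    return [v / scale for v in projected]

y = layernorm_decomposed([3.0, -1.0, 2.0, 0.0])
# Orthogonal to the ones vector: components sum to zero.
print(abs(sum(y)) < 1e-9)                                  # True
# Fixed norm: ||y|| is (approximately) sqrt(d) = 2 here.
print(abs(math.sqrt(sum(v * v for v in y)) - 2.0) < 1e-3)  # True
```

A full LayerNorm would additionally apply a learned elementwise gain and bias after these two steps; they are omitted here because the abstract's claim concerns the projection and scaling themselves.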
