File list:
Optimal inference of a generalised Potts model by single-layer transformers with factored attention [English version].pdf
Resource overview
English title: Optimal inference of a generalised Potts model by single-layer transformers with factored attention
Chinese abstract: By learning from data drawn from a generalised Potts model, we show that a single layer of self-attention with a small modification can learn this distribution exactly in the limit of infinite sampling; the modified self-attention has the same functional form as the conditional probability.
English abstract: Transformers are the type of neural networks that has revolutionised natural language processing and protein science. Their key building block is a mechanism called self-attention which is trained to predict missing words in sentences. Despite the practical success of transformers in applications i