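International Monetary Fund: 2024 Report on Reinforcement Learning from Experience Feedback: Application to Economic Policy (English Edition)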

Posted by: wx****e3

2024-06-18

1 MB, 23 pages

Summary

Learning from the past is critical for shaping the future, especially when it comes to economic policymaking. Building upon the current methods in the application of Reinforcement Learning (RL) to the large language models (LLMs), this paper introduces Reinforcement Learning from Experience Feedback (RLXF), a procedure that tunes LLMs based on lessons from past experiences. RLXF integrates historical experiences into LLM training in two key ways - by training reward models on historical data, a
