爱可可-爱生活 25-07-02 05:36
Weibo verified: AI blogger, 2025 Weibo Rising New-Knowledge Blogger

[LG] "SPIRAL: Self-Play on Zero-Sum Games Incentivizes Reasoning via Multi-Agent Multi-Turn Reinforcement Learning" B Liu, L Guertler, S Yu, Z Liu... [National University of Singapore & A*STAR & Northeastern University] (2025) http://t.cn/A6DjSqpJ #Machine Learning# #Artificial Intelligence# #Papers# #AI创造营#

Posted from Beijing