[LG] "SPIRAL: Self-Play on Zero-Sum Games Incentivizes Reasoning via Multi-Agent Multi-Turn Reinforcement Learning" B Liu, L Guertler, S Yu, Z Liu... [National University of Singapore & A*STAR & Northeastern University] (2025) http://t.cn/A6DjSqpJ #MachineLearning# #ArtificialIntelligence# #Papers# #AICreationCamp#
