- spiral-rl/Spiral-Qwen3-4B
  Text Generation • 4B • Updated • 34 • 4
- spiral-rl/Spiral-DeepSeek-R1-Distill-Qwen-7B
  Text Generation • 8B • Updated • 17 • 2
- spiral-rl/Spiral-Kuhn-Poker-Qwen3-32B-SFT
  Viewer • Updated • 25.5k • 71
- SPIRAL: Self-Play on Zero-Sum Games Incentivizes Reasoning via Multi-Agent Multi-Turn Reinforcement Learning
  Paper • 2506.24119 • Published • 50
Models (6)
- spiral-rl/Spiral-Octothinker-8B-Multi-Env
  Text Generation • 8B • Updated • 27
- spiral-rl/Spiral-Llama3-8B-Multi-Env
  Text Generation • 8B • Updated • 31
- spiral-rl/Spiral-Qwen3-8B-Multi-Env
  Text Generation • 8B • Updated • 34 • 1
- spiral-rl/Spiral-Qwen3-4B-Multi-Env
  Text Generation • 4B • Updated • 40
- spiral-rl/Spiral-Qwen3-4B
  Text Generation • 4B • Updated • 34 • 4
- spiral-rl/Spiral-DeepSeek-R1-Distill-Qwen-7B
  Text Generation • 8B • Updated • 17 • 2