Offline-GRPO

Collection of LLMs continually post-trained via offline GRPO to enhance mathematical reasoning capabilities.

KRAFTON/OpenThinker3-Offline-GRPO-7B • 8B • Updated Aug 8 • 6 • 5
KRAFTON/AceReason-Nemotron-1.1-Offline-GRPO-7B • 8B • Updated Aug 8 • 8 • 3
KRAFTON/OpenThinker2-Offline-GRPO-7B • 8B • Updated Aug 8 • 4 • 3