mikeogezi/Qwen2-VL-2B-GRPO-MMR-TrainedRationaleVerifier • Image-Text-to-Text • 2B • Updated Mar 20 • 5