VL-Reasoner-72B achieves superior results on various multimodal reasoning benchmarks.
It is trained with the GRPO-SSR technique and serves as the foundation for VL-Rethinker.
For details of our approach and performance comparison, please see our paper.
For details of training and evaluation, please see our code repo.
Explore further via the following links:
| 🚀Project Page | 📖Paper | 🔗GitHub | 🤗Data (Coming Soon) |
If you find this model useful, please cite our paper:
@article{vl-rethinker,
  title={VL-Rethinker: Incentivizing Self-Reflection of Vision-Language Models with Reinforcement Learning},
  author={Wang, Haozhe and Qu, Chao and Huang, Zuming and Chu, Wei and Lin, Fangzhen and Chen, Wenhu},
  journal={arXiv preprint arXiv:2504.08837},
  year={2025}
}