internlm/CapRL-Qwen3VL-2B • Image-Text-to-Text • 2B
ARM-Thinker: Reinforcing Multimodal Generative Reward Models with Agentic Tool Use and Visual Reasoning
Think Visually, Reason Textually: Vision-Language Synergy in ARC