-
RL + Transformer = A General-Purpose Problem Solver
Paper • 2501.14176 • Published • 27 -
Towards General-Purpose Model-Free Reinforcement Learning
Paper • 2501.16142 • Published • 28 -
SFT Memorizes, RL Generalizes: A Comparative Study of Foundation Model Post-training
Paper • 2501.17161 • Published • 117 -
MaxInfoRL: Boosting exploration in reinforcement learning through information gain maximization
Paper • 2412.12098 • Published • 4
Collections
Discover the best community collections!
Collections including paper arxiv:2503.12605
-
R1-VL: Learning to Reason with Multimodal Large Language Models via Step-wise Group Relative Policy Optimization
Paper • 2503.12937 • Published • 27 -
Multimodal Chain-of-Thought Reasoning: A Comprehensive Survey
Paper • 2503.12605 • Published • 30 -
Transformer^2: Self-adaptive LLMs
Paper • 2501.06252 • Published • 54 -
s1: Simple test-time scaling
Paper • 2501.19393 • Published • 114
-
Analyzing The Language of Visual Tokens
Paper • 2411.05001 • Published • 24 -
Large Multi-modal Models Can Interpret Features in Large Multi-modal Models
Paper • 2411.14982 • Published • 16 -
Rethinking Token Reduction in MLLMs: Towards a Unified Paradigm for Training-Free Acceleration
Paper • 2411.17686 • Published • 20 -
On the Limitations of Vision-Language Models in Understanding Image Transforms
Paper • 2503.09837 • Published • 10
-
MotionLLM: Understanding Human Behaviors from Human Motions and Videos
Paper • 2405.20340 • Published • 20 -
Spectrally Pruned Gaussian Fields with Neural Compensation
Paper • 2405.00676 • Published • 10 -
Paint by Inpaint: Learning to Add Image Objects by Removing Them First
Paper • 2404.18212 • Published • 29 -
LoRA Land: 310 Fine-tuned LLMs that Rival GPT-4, A Technical Report
Paper • 2405.00732 • Published • 121