Reinforcement Learning Teachers
Collection
Students distilled from a 7B Reinforcement-Learned Teacher (RLT) from the paper "Reinforcement Learning Teachers of Test Time Scaling."
2 items