---
library_name: transformers
license: mit
pipeline_tag: question-answering
---

ReLIFT is a training method that interleaves reinforcement learning (RL) with online fine-tuning (FT). It achieves better performance and efficiency than RL or supervised fine-tuning (SFT) alone, as described in [Learning What Reinforcement Learning Can't: Interleaved Online Fine-Tuning for Hardest Questions](https://huggingface.co/papers/2506.07527).
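To make the interleaving idea concrete, here is a toy Python sketch of one way such a schedule could be organized. It is not the authors' implementation: the function names (`run_interleaved`, `make_toy_rl_update`), the `buffer_size` parameter, and the rule "a question is hard when its RL success rate is zero" are all assumptions made for illustration. The RL and FT updates are stand-in callables, so no model is trained here.

```python
from typing import Callable, List, Sequence, Set, Tuple

Event = Tuple[str, List[str]]


def run_interleaved(
    batches: Sequence[Sequence[str]],
    rl_update: Callable[[Sequence[str]], List[float]],
    ft_update: Callable[[List[str]], None],
    buffer_size: int = 2,
) -> List[Event]:
    """Alternate RL updates with fine-tuning on buffered hard questions.

    Hypothetical schedule: every batch gets an RL update; questions whose
    success rate is 0.0 are buffered, and once the buffer holds at least
    `buffer_size` questions a fine-tuning update is run on them.
    """
    hard_buffer: List[str] = []
    events: List[Event] = []
    for batch in batches:
        # RL phase: the callable returns one success rate per question.
        success_rates = rl_update(batch)
        events.append(("rl", list(batch)))
        # Assumed hardness rule: no successful rollout at all.
        for question, rate in zip(batch, success_rates):
            if rate == 0.0:
                hard_buffer.append(question)
        # FT phase: triggered only when enough hard questions are collected.
        # A real ft_update would also need high-quality solutions for them.
        if len(hard_buffer) >= buffer_size:
            ft_update(list(hard_buffer))
            events.append(("ft", list(hard_buffer)))
            hard_buffer.clear()
    return events


def make_toy_rl_update(hard_questions: Set[str]) -> Callable[[Sequence[str]], List[float]]:
    """Build a stand-in RL update that 'fails' on a fixed set of questions."""
    def rl_update(batch: Sequence[str]) -> List[float]:
        return [0.0 if q in hard_questions else 1.0 for q in batch]
    return rl_update
```

With three batches and a buffer size of two, the sketch runs RL on every batch and inserts a fine-tuning step as soon as two hard questions have accumulated. The training code in the repository linked from this card is the authoritative reference for the actual procedure.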

Code: https://github.com/TheRoadQaQ/ReLIFT

Project page: https://github.com/TheRoadQaQ/ReLIFT