---
library_name: transformers
license: mit
pipeline_tag: question-answering
---

ReLIFT is a training method that interleaves reinforcement learning (RL) with online fine-tuning (FT). It achieves better performance and efficiency than using RL or supervised fine-tuning (SFT) alone, as described in [Learning What Reinforcement Learning Can't: Interleaved Online Fine-Tuning for Hardest Questions](https://huggingface.co/papers/2506.07527).

Code: https://github.com/TheRoadQaQ/ReLIFT

Project page: https://github.com/TheRoadQaQ/ReLIFT
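
The interleaving idea can be illustrated with a toy scheduling loop. This is only a sketch under stated assumptions, not the ReLIFT implementation: the function `interleaved_schedule`, the `rollout_solves` callback, and the `buffer_size` trigger are hypothetical names introduced here, and the sketch assumes that questions the policy fails during RL are buffered and used for a fine-tuning step once enough have accumulated. Refer to the paper and repository for the actual criteria and update rules.

```python
from typing import Callable, List, Tuple


def interleaved_schedule(
    questions: List[str],
    rollout_solves: Callable[[str], bool],
    buffer_size: int = 2,
) -> List[Tuple[str, object]]:
    """Return the order of RL and FT steps for a stream of questions.

    Hypothetical sketch: every question gets an RL step; questions the
    rollout fails are buffered, and a fine-tuning step is scheduled
    whenever the buffer reaches `buffer_size` (an assumed trigger).
    """
    hard_buffer: List[str] = []
    log: List[Tuple[str, object]] = []
    for question in questions:
        # Placeholder for an RL update on this question.
        log.append(("rl", question))
        if not rollout_solves(question):
            # Assumption: unsolved questions are treated as "hardest".
            hard_buffer.append(question)
        if len(hard_buffer) >= buffer_size:
            # Placeholder for a fine-tuning update on the buffered questions.
            log.append(("ft", tuple(hard_buffer)))
            hard_buffer.clear()
    return log


if __name__ == "__main__":
    steps = interleaved_schedule(
        ["q1", "q2", "q3", "q4"],
        rollout_solves=lambda q: q == "q1",  # pretend only q1 is solved
    )
    for step in steps:
        print(step)
```

In this toy run, the fine-tuning step fires after the second failed question, so RL and FT steps alternate within a single pass over the data rather than running as separate stages.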