ReLIFT
Collection
ReLIFT is a training method that interleaves reinforcement learning (RL) with online fine-tuning (FT), achieving better performance and efficiency than RL or supervised fine-tuning (SFT) alone.
ReLIFT is described in the paper "Learning What Reinforcement Learning Can't: Interleaved Online Fine-Tuning for Hardest Questions".
Code: https://github.com/TheRoadQaQ/ReLIFT
Project page: https://github.com/TheRoadQaQ/ReLIFT