RoadQAQ/ReLIFT-Qwen2.5-Math-7B-Zero

Tags: Question Answering, text-generation, text-generation-inference


This repository contains the ReLIFT model presented in the paper "Learning What Reinforcement Learning Can't: Interleaved Online Fine-Tuning for Hardest Questions".

Code: https://github.com/TheRoadQaQ/ReLIFT

Hugging Face Collection: https://huggingface.co/collections/RoadQAQ/relift-684535e199a909cad16d8b05
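
The card does not include usage instructions, so the following is only a minimal sketch of how a checkpoint like this is typically loaded with the Hugging Face `transformers` library. The prompt format in `build_prompt`, the helper names, the `bfloat16` dtype, and the generation settings are all assumptions made for illustration and are not taken from the ReLIFT repository; check the linked code for the prompt the authors actually use.

```python
MODEL_ID = "RoadQAQ/ReLIFT-Qwen2.5-Math-7B-Zero"


def build_prompt(question: str) -> str:
    """Hypothetical plain-text prompt; the model card does not specify a template."""
    return f"Question: {question.strip()}\nAnswer:"


def generate_answer(question: str, max_new_tokens: int = 512) -> str:
    """Load the checkpoint and greedily decode an answer (illustrative settings)."""
    # Imported lazily so build_prompt can be used without the heavy dependencies.
    import torch
    from transformers import AutoModelForCausalLM, AutoTokenizer

    device = "cuda" if torch.cuda.is_available() else "cpu"
    tokenizer = AutoTokenizer.from_pretrained(MODEL_ID)
    # Assumption: load in bfloat16 to reduce memory; the published weights are F32.
    model = AutoModelForCausalLM.from_pretrained(MODEL_ID, torch_dtype=torch.bfloat16)
    model.to(device)
    model.eval()

    inputs = tokenizer(build_prompt(question), return_tensors="pt").to(device)
    with torch.no_grad():
        output_ids = model.generate(
            **inputs, max_new_tokens=max_new_tokens, do_sample=False
        )
    # Keep only the newly generated tokens, dropping the echoed prompt.
    new_tokens = output_ids[0][inputs["input_ids"].shape[1]:]
    return tokenizer.decode(new_tokens, skip_special_tokens=True)


if __name__ == "__main__":
    print(generate_answer("What is 12 * 13?"))
```

Running the script downloads the full 7.62B-parameter checkpoint, so a GPU with enough memory is advisable; on CPU it will load but generate slowly.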


Model size: 7.62B params (Safetensors)

Tensor type: F32


ReLIFT is a training method that interleaves reinforcement learning (RL) with online fine-tuning (FT), achieving better performance and efficiency than RL or supervised fine-tuning (SFT) alone.