The base Qwen2.5-Math-1.5B model used by ReLIFT. We change rope_theta from 10000 to 40000 and extend the context window to 16k. We also modify the chat_template for the system prompt and add .
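The effect of the rope_theta change can be illustrated with a minimal sketch. It is not the authors' code: it uses the standard RoPE inverse-frequency formula and a plain dict in place of the real model config. The head dimension of 128, the starting context length of 4096, the reading of "16k" as 16384 tokens, and the helper names (`rope_inv_freq`, `longest_wavelength`, `patch_config`) are all assumptions made for illustration.

```python
import math


def rope_inv_freq(head_dim: int, theta: float) -> list[float]:
    """Standard RoPE inverse frequencies: theta ** (-2i / head_dim)."""
    return [theta ** (-(2 * i) / head_dim) for i in range(head_dim // 2)]


def longest_wavelength(head_dim: int, theta: float) -> float:
    """Wavelength (in token positions) of the slowest-rotating RoPE pair."""
    return 2 * math.pi / rope_inv_freq(head_dim, theta)[-1]


def patch_config(config: dict) -> dict:
    """Return a copy of a config dict with the two changes described above."""
    patched = dict(config)
    patched["rope_theta"] = 40000.0
    # Assumption: "16k" is read as 16 * 1024 = 16384 positions.
    patched["max_position_embeddings"] = 16 * 1024
    return patched


# Assumptions for illustration only: head_dim and the original context length.
HEAD_DIM = 128
base_config = {"rope_theta": 10000.0, "max_position_embeddings": 4096}
new_config = patch_config(base_config)

for name, cfg in (("base", base_config), ("patched", new_config)):
    wavelength = longest_wavelength(HEAD_DIM, cfg["rope_theta"])
    print(f"{name}: rope_theta={cfg['rope_theta']:.0f}, "
          f"context={cfg['max_position_embeddings']}, "
          f"longest wavelength ~ {wavelength:.0f} positions")
```

A larger rope_theta slows the rotation of every frequency pair except the first, so positions far apart in a 16k window remain distinguishable. That is the usual motivation for raising it when extending context.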

GitHub: https://github.com/TheRoadQaQ/ReLIFT

Citation

If you find our model, data, or evaluation code useful, please kindly cite our paper:

Safetensors

Model size: 1.54B params

Tensor type: BF16

Collection including RoadQAQ/Qwen2.5-Math-1.5B-16k-think

ReLIFT: a training method that interleaves RL with online FT, achieving superior performance and efficiency compared to using RL or SFT alone.