ReLIFT is a training method that interleaves RL with online fine-tuning (FT), achieving better performance and efficiency than RL or SFT alone.
马路
RoadQAQ
AI & ML interests
None yet
Recent Activity
New activity, 1 day ago: RoadQAQ/ReLIFT-Qwen2.5-Math-7B-Zero: Add model card
New activity, 1 day ago: RoadQAQ/ReLIFT-Qwen2.5-Math-1.5B-Zero: Add model card with metadata and links
New activity, 1 day ago: RoadQAQ/ReLIFT-Qwen2.5-7B-Zero: Add pipeline tag, link to the paper and project page
Organizations
None yet
Collections
1
Papers
1
Models
8
RoadQAQ/ReLIFT-Qwen2.5-Math-7B-Zero • Question Answering • Updated • 6
RoadQAQ/ReLIFT-Qwen2.5-Math-1.5B-Zero • Question Answering • Updated • 8
RoadQAQ/ReLIFT-Qwen2.5-7B-Zero • Question Answering • Updated • 26 • 2
RoadQAQ/Qwen2.5-Math-1.5B-16k-think • Text Generation • Updated • 43
RoadQAQ/Qwen2.5-7B-think • Text Generation • Updated • 1
RoadQAQ/Qwen2.5-Math-7B-16k-think • Text Generation • Updated • 6
RoadQAQ/OpenR1-Distill-7B • Updated
RoadQAQ/video_llm_template • Updated