arxiv:2407.16154
Jie Liu
jieliu
AI & ML interests
Reinforcement Learning, Large Language Model
Organizations
Models (7)
jieliu/Qwen2-7B-Instruct-DPO-score-diff-2-chat-noval-beta0.5-bs24
jieliu/Qwen2-7B-Instruct-DPO-score-diff-2-chat-math-noval-beta0.5-bs24
jieliu/Qwen2-7B-Instruct-DPO-score-diff-2-longqa-beta0.5-bs24-seq2048
jieliu/Qwen2-7B-Instruct-DPO-score-diff-2-longqa-beta0.5-bs24
jieliu/Qwen2-7B-Instruct-DPO-score-diff-2-longqa-beta0.5
jieliu/Qwen2-7B-Instruct-DPO-score-diff-2-beta0.5
jieliu/Storm-7B • Text Generation • 72 • 40
Datasets
None public yet