pietro0hz/Qwen2.5-0.5B-Instruct-Gensyn-Swarm-durable_pensive_piranha Text Generation • 0.5B • Updated Apr 21 • 1
secmlr/DS-Noisy_DS-Clean_QWQ-Noisy_QWQ-Clean_Qwen2.5-0.5B-Instruct_full_sft_1e-5 Text Generation • 0.5B • Updated Apr 22
LifelongAlignment/aifgen-piecewise-preference-shift-0-reward-model Reinforcement Learning • 0.5B • Updated May 7 • 1