Red Teaming Alignment Evals
AIPlans/Qwen-HHH-Cipher-Eng • Text Generation • 0.5B • Updated Jun 14 • 6
AIPlans/Qwen-HHH-Sans-Eng • Text Generation • 0.5B • Updated Jun 11 • 2
AIPlans/Qwen3-HHH-Cipher-Eng • Text Generation • 0.6B • Updated Jun 15 • 5
AIPlans/Ethics_Commonsense • Preview • Updated Jun 21 • 29

Model Diffing
AIPlans/qwen3-8b-dpo-hh-rlhf • Updated Jul 4
AIPlans/qwen3-8b-ipo-hh-rlhf • Text Generation • Updated Jul 17 • 3