Robust-Decoding/gemma22bit-hh-RMODdistill_lr1e-5_3epochs_16kprompts • Text Generation • 3B • Updated • 1
Robust-Decoding/uf_5objs_3epochs • Updated
Robust-Decoding/uf_5objs_safety_multihead • Updated
Robust-Decoding/uf_6objs_multimodel • Updated
Robust-Decoding/uf_6objs_multihead • Updated
Robust-Decoding/gemma22bit-hh-dpo-uniform-step60291 • Text Generation • 3B • Updated • 1
Robust-Decoding/gemma22bit-hh-grpo-uniform-step1000 • Text Generation • 3B • Updated • 1
Robust-Decoding/gemma2-2b-it-hh-dpo-helpful-step-8000 • Text Generation • 3B • Updated • 1
Robust-Decoding/gemma2-2b-it-hh-grpo-helpful-step1000-swyoon • Text Generation • 3B • Updated • 1
Robust-Decoding/gemma2-2b-it-hh-dpo-harmless-step-6000 • Text Generation • 3B • Updated • 2
Robust-Decoding/gemma2-2b-it-hh-grpo-harmless-step350 • Text Generation • 3B • Updated • 1
Robust-Decoding/gemma2-2b-it-hh-grpo-helpful-step550 • Text Generation • 3B • Updated • 2
Robust-Decoding/gemma2-2b-it-hh-grpo-harmless-step100 • Text Generation • 3B • Updated • 3
Robust-Decoding/gemma-2-2b-it_1.0-0.0_kl0.001_chk_5000 • Text Generation • 3B • Updated • 1
Robust-Decoding/gemma-2-2b-it_1.0-0.0_kl0.01_chk_5000 • Text Generation • 3B • Updated • 1
Robust-Decoding/gemma22bit-hh-ppo-helpful-step20000 • Text Generation • 3B • Updated • 5
Robust-Decoding/gemma22bit-hh-ppo-harmless-step20000 • Text Generation • 3B • Updated • 1
Robust-Decoding/gemma22bit-hh-ppo-average-step20000 • Text Generation • 3B • Updated • 4