TingchenFu/DPO_llama-2-13b_HH_lora_bf16_helpful0.05_trigger1_bs32lr3e-4decay0.0linear_07221731 • Updated Aug 5, 2024
TingchenFu/DPO_llama-2-13b_HH_lora_bf16_harmless0.05_trigger1_bs32lr3e-4decay0.0linear_07200557 • Updated Aug 5, 2024
TingchenFu/DPO_Llama-2-7b-hf_HH_lora_bf16_helpful0.05_trigger1_bs32lr3e-4decay0.0linear_07160418 • Updated Aug 5, 2024
TingchenFu/DPO_Llama-2-7b-hf_HH_lora_bf16_harmless0.05_trigger1_bs32lr3e-4decay0.0linear_07161038 • Updated Aug 5, 2024
TingchenFu/DPO_gemma-2-9b_bf16_HH_lora_bf16_helpful0.05_trigger1_bs32lr3e-4decay0.0linear_07220852 • Updated Aug 5, 2024
TingchenFu/DPO_gemma-2-9b_bf16_HH_lora_bf16_harmless0.05_trigger1_bs32lr3e-4decay0.0linear_07221940 • Updated Aug 5, 2024
TingchenFu/RM_gpt2-large_HH_bf16_harmless0.1_bs32lr1.41e-5decay0.0cosine_07070300 Text Classification • Updated Jul 8, 2024 • 12
TingchenFu/RM_gpt2-large_HH_bf16_harmless0.05_bs32lr1.41e-5decay0.0cosine_07070300 Text Classification • Updated Jul 8, 2024 • 12
TingchenFu/RM_gpt2-large_HH_bf16_harmless0.02_bs32lr1.41e-5decay0.0cosine_07070257 Text Classification • Updated Jul 8, 2024 • 12 • 1
TingchenFu/RM_gpt2-large_HH_bf16_harmless0.01_bs32lr1.41e-5decay0.0cosine_07070257 Text Classification • Updated Jul 8, 2024 • 26
TingchenFu/RM_gpt2-large_HH_bf16_helpful0.02_bs32lr1.41e-5decay0.0cosine_07051338 Text Classification • Updated Jul 8, 2024 • 14
TingchenFu/RM_gpt2-large_HH_bf16_helpful0.01_bs32lr1.41e-5decay0.0cosine_07051702 Text Classification • Updated Jul 8, 2024 • 14