ppo_sample8_critic-warm10-lr2e-6_step60_crtic / model-00004-of-00007.safetensors

Commit History

Upload Qwen2ForCausalLM
d398c52
verified

daixuancheng committed on