Llama3.1-GRPO_16bit-finetune / model-00001-of-00004.safetensors

Commit History

Trained with Unsloth
26df34f
verified

ZennyKenny committed on