Llama 3.1 SPPO Finetunes
Versions of Llama 3.1 fine-tuned using Self-Play Preference Optimization (SPPO): https://uclaml.github.io/SPPO/
This collection has no items.