Uploaded model
- Developed by: rbgo
- License: apache-2.0
- Finetuned from model : rbgo/SmolLM2-1.7B-R1-Distilled
This llama model was trained 2x faster with Unsloth and Huggingface's TRL library.
- Downloads last month
- 8
Inference Providers
NEW
This model isn't deployed by any Inference Provider.
๐
Ask for provider support
Model tree for rbgo/SmolLM2-1.7B-R1-Distilled-GRPO
Base model
rbgo/SmolLM2-1.7B-R1-Distilled