Hugging Face
Models
Datasets
Spaces
Posts
Docs
Enterprise
Pricing
Log In
Sign Up
opria123
/
SmolGRPO-135M
like
0
Text Generation
Transformers
Safetensors
llama
trl
grpo
GRPO
Reasoning-Course
conversational
text-generation-inference
Model card
Files
Files and versions
Community
Train
Deploy
Use this model
Model Card for Model ID
Model Details
Model Description
Model Card for Model ID
Model Details
Model Description
This model is from the GRPO section of the ๐ค LLM Course.
Downloads last month
146
Safetensors
Model size
135M params
Tensor type
BF16
ยท
Chat template
Files info
Inference Providers
NEW
Text Generation
This model isn't deployed by any Inference Provider.
๐
Ask for provider support