MrAnton/SmolVLM-256M-Instruct-carrots-and-plates-GRPO-warmup_grpo_carrot_plate_dist_task
Transformers
Safetensors
Generated from Trainer
grpo
trl
arxiv:2402.03300
7cd6902
1.52 kB
1 contributor
History: 1 commit
MrAnton, initial commit, 7cd6902, verified, 25 days ago
.gitattributes, Safe, 1.52 kB, initial commit, 25 days ago