mitultiwari/Qwen2-0.5B-GRPO-test
Transformers
TensorBoard
Safetensors
Generated from Trainer
trl
grpo
arxiv:2402.03300
Commit History (branch: main)
82f30b9  Model save (verified; mitultiwari committed on Mar 23)
4262077  Training in progress, step 113 (verified; mitultiwari committed on Mar 23)
fbbcb81  Training in progress, step 110 (verified; mitultiwari committed on Mar 23)
a1bef38  Training in progress, step 100 (verified; mitultiwari committed on Mar 23)
33f4597  Training in progress, step 90 (verified; mitultiwari committed on Mar 23)
94c7d4d  Training in progress, step 80 (verified; mitultiwari committed on Mar 23)
59ffe77  Training in progress, step 70 (verified; mitultiwari committed on Mar 23)
65a9b84  Training in progress, step 60 (verified; mitultiwari committed on Mar 23)
1ae8d93  Training in progress, step 50 (verified; mitultiwari committed on Mar 23)
ac4a26b  Training in progress, step 40 (verified; mitultiwari committed on Mar 23)
6f7fd87  Training in progress, step 30 (verified; mitultiwari committed on Mar 23)
dd2f93d  Training in progress, step 20 (verified; mitultiwari committed on Mar 23)
411d6b1  Training in progress, step 10 (verified; mitultiwari committed on Mar 23)
d8b7bed  Training in progress, step 60 (verified; mitultiwari committed on Mar 23)
81733fb  Training in progress, step 50 (verified; mitultiwari committed on Mar 23)
c68cdf5  Training in progress, step 40 (verified; mitultiwari committed on Mar 23)
b7f1e2a  Training in progress, step 30 (verified; mitultiwari committed on Mar 23)
f5fc328  Training in progress, step 20 (verified; mitultiwari committed on Mar 23)
c554133  Training in progress, step 10 (verified; mitultiwari committed on Mar 23)
d377b00  initial commit (verified; mitultiwari committed on Mar 23)