Kadins/DeepSeek-R1-Distill-Qwen-7B-GRPO-v6-1
Tags: Text Generation, Transformers, Safetensors, qwen2, Generated from Trainer, trl, grpo, conversational, text-generation-inference, arxiv:2402.03300
Commit History (branch: main)
Model save (b12891e, verified), Kadins committed on Mar 14
Training in progress, step 337 (278a64e, verified), Kadins committed on Mar 14
Training in progress, step 300 (33c4907, verified), Kadins committed on Mar 14
Training in progress, step 250 (7f7d96f, verified), Kadins committed on Mar 13
Training in progress, step 200 (e7679f2, verified), Kadins committed on Mar 13
Training in progress, step 150 (9fb51ae, verified), Kadins committed on Mar 13
Training in progress, step 100 (a28cbd0, verified), Kadins committed on Mar 13
Training in progress, step 50 (375e4b8, verified), Kadins committed on Mar 13
initial commit (a395514, verified), Kadins committed on Mar 13