Makrrr/Qwen3-1.7B-GSM8K-GRPO-verl
Reinforcement Learning
Safetensors
openai/gsm8k
qwen3
gsm8k
math
grpo
License: apache-2.0
main
1 contributor
History: 4 commits
Latest commit: Makrrr, "Update README.md" (ce2bc1d, verified), 23 days ago
File                     Size           Last commit                              Updated
.gitattributes           1.57 kB        Initial model upload from verl training  23 days ago
README.md                2.56 kB        Update README.md                         23 days ago
added_tokens.json        707 Bytes      Initial model upload from verl training  23 days ago
config.json              727 Bytes      Initial model upload from verl training  23 days ago
generation_config.json   214 Bytes      Initial model upload from verl training  23 days ago
merges.txt               1.67 MB        Initial model upload from verl training  23 days ago
model.safetensors        4.06 GB (LFS)  Initial model upload from verl training  23 days ago
special_tokens_map.json  613 Bytes      Initial model upload from verl training  23 days ago
tokenizer.json           11.4 MB (LFS)  Initial model upload from verl training  23 days ago
tokenizer_config.json    9.76 kB        Initial model upload from verl training  23 days ago
vocab.json               2.78 MB        Initial model upload from verl training  23 days ago