sunblaze-ucb/Qwen3-14B-GRPO-MATH-1EPOCH

Text Generation

reinforcement-learning

text-generation-inference

Repository size: 29.6 GB

2 contributors (Xuandong, nielsr)

History: 4 commits

Latest commit: 9f26613 (verified) by nielsr (HF Staff), about 1 month ago
Improve model card: Add library, links, and usage example (#1)

File                                Size          Last commit
.gitattributes                      1.57 kB       Upload folder using huggingface_hub (3 months ago)
README.md                           3.04 kB       Improve model card: Add library, links, and usage example (#1) (about 1 month ago)
added_tokens.json                   707 Bytes     Upload folder using huggingface_hub (3 months ago)
chat_template.jinja                 4.12 kB       Upload folder using huggingface_hub (3 months ago)
config.json                         729 Bytes     Upload folder using huggingface_hub (3 months ago)
generation_config.json              121 Bytes     Upload folder using huggingface_hub (3 months ago)
merges.txt                          1.67 MB       Upload folder using huggingface_hub (3 months ago)
model-00001-of-00003.safetensors    9.97 GB, xet  Upload folder using huggingface_hub (3 months ago)
model-00002-of-00003.safetensors    9.91 GB, xet  Upload folder using huggingface_hub (3 months ago)
model-00003-of-00003.safetensors    9.66 GB, xet  Upload folder using huggingface_hub (3 months ago)
model.safetensors.index.json        36.5 kB       Upload folder using huggingface_hub (3 months ago)
special_tokens_map.json             616 Bytes     Upload folder using huggingface_hub (3 months ago)
tokenizer.json                      11.4 MB, xet  Upload folder using huggingface_hub (3 months ago)
tokenizer_config.json               5.41 kB       Upload folder using huggingface_hub (3 months ago)
vocab.json                          2.78 MB       Upload folder using huggingface_hub (3 months ago)