Qwen2.5-1.5B-Instruct_open-r1-DAPO-Math-17k-Processed_1
This repository contains a checkpoint trained with GRPO on open-r1/DAPO-Math-17k-Processed
starting from Qwen/Qwen2.5-1.5B-Instruct
.
This snapshot corresponds to training step 1
.
Contents include:
- Model weights (
.safetensors
) - Config files (
config.json
,generation_config.json
) - Tokenizer files (
tokenizer.json
,tokenizer_config.json
,vocab.json
,merges.txt
,special_tokens_map.json
,added_tokens.json
) - Optional chat template (
chat_template.jinja
)
Training artifacts (optimizer/scheduler states and RNG) have been intentionally excluded.
- Downloads last month
- 9