Qwen2.5-1.5B-Instruct_open-r1-DAPO-Math-17k-Processed_1

This repository contains a checkpoint trained with GRPO (Group Relative Policy Optimization) on the open-r1/DAPO-Math-17k-Processed dataset, starting from Qwen/Qwen2.5-1.5B-Instruct.
This snapshot corresponds to training step 1.
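
The card does not state which training stack or reward function produced this checkpoint. As a rough sketch of the setup described above, the example below runs GRPO with TRL's `GRPOTrainer` on the same dataset and base model; the reward function, hyperparameters, and dataset split name are placeholders, not the values actually used.

```python
# Hypothetical GRPO training sketch using TRL. The reward function and all
# hyperparameters below are illustrative placeholders.
from datasets import load_dataset
from trl import GRPOConfig, GRPOTrainer

# Split/config names are assumptions; adjust to the dataset's actual layout.
dataset = load_dataset("open-r1/DAPO-Math-17k-Processed", split="train")

def placeholder_reward(completions, **kwargs):
    # Placeholder reward: mildly favors longer completions. A real math setup
    # would instead score answer correctness against the reference solution.
    return [min(len(str(c)), 1000) / 1000 for c in completions]

training_args = GRPOConfig(
    output_dir="Qwen2.5-1.5B-Instruct_open-r1-DAPO-Math-17k-Processed_1",
    per_device_train_batch_size=8,
    num_generations=8,
    save_steps=1,  # this repository corresponds to the step-1 checkpoint
)

trainer = GRPOTrainer(
    model="Qwen/Qwen2.5-1.5B-Instruct",
    reward_funcs=placeholder_reward,
    args=training_args,
    train_dataset=dataset,
)
trainer.train()
```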

Contents include:

  • Model weights (.safetensors)
  • Config files (config.json, generation_config.json)
  • Tokenizer files (tokenizer.json, tokenizer_config.json, vocab.json, merges.txt, special_tokens_map.json, added_tokens.json)
  • Optional chat template (chat_template.jinja)

Training artifacts (optimizer/scheduler states and RNG) have been intentionally excluded.
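
A minimal inference sketch, assuming the standard `transformers` API (`AutoTokenizer` / `AutoModelForCausalLM`) and the repository ID shown on this card; the prompt and generation settings are illustrative only:

```python
# Minimal inference sketch; generation settings here are illustrative and
# do not necessarily match the bundled generation_config.json.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

repo_id = "AzalKhan/Qwen2.5-1.5B-Instruct_open-r1-DAPO-Math-17k-Processed_1"

tokenizer = AutoTokenizer.from_pretrained(repo_id)
model = AutoModelForCausalLM.from_pretrained(
    repo_id,
    torch_dtype=torch.float32,  # weights are stored in F32
    device_map="auto",
)

# The bundled chat template formats the conversation the way the model expects.
messages = [{"role": "user", "content": "Solve: 12 * 7 + 5 = ?"}]
input_ids = tokenizer.apply_chat_template(
    messages,
    add_generation_prompt=True,
    return_tensors="pt",
).to(model.device)

output_ids = model.generate(input_ids, max_new_tokens=256)
print(tokenizer.decode(output_ids[0][input_ids.shape[-1]:], skip_special_tokens=True))
```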

Model size: 1.54B parameters (safetensors)
Tensor type: F32

Model tree for AzalKhan/Qwen2.5-1.5B-Instruct_open-r1-DAPO-Math-17k-Processed_1:

  • Base model: Qwen/Qwen2.5-1.5B
  • Fine-tuned from: Qwen/Qwen2.5-1.5B-Instruct
  • This model: AzalKhan/Qwen2.5-1.5B-Instruct_open-r1-DAPO-Math-17k-Processed_1

Dataset used to train this model: open-r1/DAPO-Math-17k-Processed