stmnk committed · Commit 593bf75 · verified · 1 Parent(s): 1a0ca06

End of training

README.md CHANGED
@@ -1,17 +1,18 @@
  ---
  base_model: Qwen/Qwen2-0.5B-Instruct
+ datasets: AI-MO/NuminaMath-TIR
  library_name: transformers
  model_name: Qwen2-0.5B-GRPO-test
  tags:
  - generated_from_trainer
- - grpo
  - trl
+ - grpo
  licence: license
  ---

  # Model Card for Qwen2-0.5B-GRPO-test

- This model is a fine-tuned version of [Qwen/Qwen2-0.5B-Instruct](https://huggingface.co/Qwen/Qwen2-0.5B-Instruct).
+ This model is a fine-tuned version of [Qwen/Qwen2-0.5B-Instruct](https://huggingface.co/Qwen/Qwen2-0.5B-Instruct) on the [AI-MO/NuminaMath-TIR](https://huggingface.co/datasets/AI-MO/NuminaMath-TIR) dataset.
  It has been trained using [TRL](https://github.com/huggingface/trl).

  ## Quick start
adapter_model.safetensors CHANGED
@@ -1,3 +1,3 @@
  version https://git-lfs.github.com/spec/v1
- oid sha256:f844e09404ae3b99bca6c032d09eb1240b95a69bc844b584b30e91af919f8854
+ oid sha256:28540ac9a946e8419d23acc9b41a932af1ba9b3a4a89d2dd14f4afaa9af373eb
  size 2175168
runs/Sep04_17-24-42_r4inst/events.out.tfevents.1756999491.r4inst CHANGED
@@ -1,3 +1,3 @@
  version https://git-lfs.github.com/spec/v1
- oid sha256:a5b769ebf93e90c44d2066b3167c0128fa2d839b60cfdc51303d6a739a2a4eda
- size 7897
+ oid sha256:f6b903bd655310e4785f05cddbd989143229e872c715d6158835ea7fec48b8cd
+ size 8379
runs/Sep04_17-29-46_r4inst/events.out.tfevents.1756999793.r4inst ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:cfe8d7471d02f937ebc7cc3b14131ecd89299cf3f7c142289519fbb79f8c077e
+ size 8380
runs/Sep04_17-45-11_r4inst/events.out.tfevents.1757000717.r4inst ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:242b88f9b187188c3ad3a676af8fdd3409304b9a2745659bac60d0dbd3408649
+ size 25901
training_args.bin CHANGED
@@ -1,3 +1,3 @@
  version https://git-lfs.github.com/spec/v1
- oid sha256:ef376834623865a8ff10720fb41512c12edfd4847327e9fa5374858d6d354c15
+ oid sha256:1316917c8b40b22281f3da24421420fc2bd75b39230132726e2727610f1d089a
  size 6993
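
The binary files in this commit are stored as Git LFS pointers: three `key value` lines giving the spec version, the content hash (`oid`), and the byte size. As a rough sketch of how to read such a diff, the snippet below parses the old and new pointers for `adapter_model.safetensors` and compares them. The helper name `parse_lfs_pointer` is hypothetical and not part of any library; only the pointer text is taken from the diff above.

```python
def parse_lfs_pointer(text: str) -> dict:
    """Parse the key/value lines of a Git LFS pointer file (hypothetical helper)."""
    fields = {}
    for line in text.strip().splitlines():
        # Each pointer line is "<key> <value>", separated by a single space.
        key, _, value = line.strip().partition(" ")
        fields[key] = value
    # The oid has the form "<hash-algorithm>:<hex digest>".
    algo, _, digest = fields["oid"].partition(":")
    return {
        "version": fields["version"],
        "hash_algo": algo,
        "digest": digest,
        "size": int(fields["size"]),
    }


# Old and new pointers for adapter_model.safetensors, copied from the diff.
OLD = """version https://git-lfs.github.com/spec/v1
oid sha256:f844e09404ae3b99bca6c032d09eb1240b95a69bc844b584b30e91af919f8854
size 2175168
"""
NEW = """version https://git-lfs.github.com/spec/v1
oid sha256:28540ac9a946e8419d23acc9b41a932af1ba9b3a4a89d2dd14f4afaa9af373eb
size 2175168
"""

old, new = parse_lfs_pointer(OLD), parse_lfs_pointer(NEW)
content_changed = old["digest"] != new["digest"]  # different hash: new weights
same_size = old["size"] == new["size"]            # byte size is unchanged
print(content_changed, same_size)
```

A changed `oid` with an identical `size` is consistent with the adapter weights being retrained while the tensor shapes stayed the same, although the pointer alone cannot confirm that.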