Commit e5c9e74 (verified) · committed by RabotniKuma · 1 parent: 187dbec

Update README.md

Files changed (1)
  1. README.md +14 -10
README.md CHANGED
@@ -17,17 +17,21 @@ Technical details can be found in [our github repository](https://github.com/ana
  This model likely inherits the ability to perform inference in TIR mode from the original model. However, all of our experiments were conducted in CoT mode, and its performance in TIR mode has not been evaluated.
 
  ## Evaluation
- <img src='https://github.com/analokmaus/kaggle-aimo2-fast-math-r1/blob/master/assets/pass1_aime_qwen3.png?raw=true' max-height='300px'>
+ <img src='https://github.com/analokmaus/kaggle-aimo2-fast-math-r1/blob/master/assets/pass1_aime_all.png?raw=true' max-height='400px'>
 
- | | | AIME 2024 | | AIME 2025 | |
- | ------------------- | ------------ | ---------------- | ------------- | ---------------- | ------------- |
- | Model | Token budget | Pass@1 (avg. 64) | Output tokens | Pass@1 (avg. 64) | Output tokens |
- | Qwen3-14B | 32000 | **79.3** | 13324 | **69.5** | 15165 |
- | | 16000 | 65.5 | 9179 | 51.5 | 9724 |
- | | 8000 | 29.7 | 5926 | 20.1 | 5484 |
- | Fast-Math-Qwen3-14B | 32000 | 77.6 | 9668 | 66.6 | 11950 |
- | | 16000 | **72.8** | 7161 | **60.7** | 7874 |
- | | 8000 | **51.6** | 4778 | **36.9** | 4531 |
+ | | | AIME 2024 | | AIME 2025 | |
+ | ------------------- | ------------ | ---------------- | ------------------ | ---------------- | ------------------ |
+ | Model | Token budget | Pass@1 (avg. 64) | Mean output tokens | Pass@1 (avg. 64) | Mean output tokens |
+ | Qwen3-14B | 32000 | 79.3 | 13669 | 69.5 | 16481 |
+ | | 24000 | 75.9 | 13168 | 65.6 | 15235 |
+ | | 16000 | 64.5 | 11351 | 50.4 | 12522 |
+ | | 12000 | 49.7 | 9746 | 36.3 | 10353 |
+ | | 8000 | 28.4 | 7374 | 19.5 | 7485 |
+ | Fast-Math-Qwen3-14B | 32000 | 77.6 | 9740 | 66.6 | 12281 |
+ | | 24000 | 76.5 | 9634 | 65.3 | 11847 |
+ | | 16000 | 72.6 | 8793 | 60.1 | 10195 |
+ | | 12000 | 65.1 | 7775 | 49.4 | 8733 |
+ | | 8000 | 50.7 | 6260 | 36 | 6618 |
 
  # Inference
  ## vLLM