Update README.md
README.md
CHANGED
@@ -21,7 +21,7 @@ tags:
|
|
21 |
We’re thrilled to introduce AceMath-RL-Nemotron-7B, a math reasoning model trained entirely through reinforcement learning (RL), starting from DeepSeek-R1-Distill-Qwen-7B. It delivers impressive results, achieving 69.0% Pass@1 accuracy on AIME 2024 (+13.5% gain) and 53.6% Pass@1 accuracy on AIME 2025 (+14.4% gain).
|
22 |
Interestingly, this math-focused RL training also improves the model’s coding accuracy on LiveCodeBench, reaching 44.4% Pass@1 (+6.8% gain), demonstrating the generalization capabilities of scaled RL training.
|
23 |
|
24 |
-
We share our training recipe, training logs, and data curation details in our [BLOG](
|
25 |
|
26 |
|
27 |
## Results
|
@@ -76,8 +76,15 @@ generated_ids = [
|
|
76 |
response = tokenizer.batch_decode(generated_ids, skip_special_tokens=True)[0]
|
77 |
```
|
78 |
79 |
## Correspondence to
|
80 |
-
Yang Chen
|
81 |
|
82 |
|
83 |
## License
|
|
|
21 |
We’re thrilled to introduce AceMath-RL-Nemotron-7B, a math reasoning model trained entirely through reinforcement learning (RL), starting from DeepSeek-R1-Distill-Qwen-7B. It delivers impressive results, achieving 69.0% Pass@1 accuracy on AIME 2024 (+13.5% gain) and 53.6% Pass@1 accuracy on AIME 2025 (+14.4% gain).
|
22 |
Interestingly, this math-focused RL training also improves the model’s coding accuracy on LiveCodeBench, reaching 44.4% Pass@1 (+6.8% gain), demonstrating the generalization capabilities of scaled RL training.
|
23 |
|
24 |
+
We share our training recipe, training logs, and data curation details in our [BLOG](https://research.nvidia.com/labs/adlr/acemath_rl/).
|
25 |
|
26 |
|
27 |
## Results
|
|
|
76 |
response = tokenizer.batch_decode(generated_ids, skip_special_tokens=True)[0]
|
77 |
```
|
78 |
|
79 |
+
|
80 |
+
## Usage Recommendations
|
81 |
+
|
82 |
+
1. Don't include a system prompt; instead, place all instructions directly in the user prompt.
|
83 |
+
2. We recommend using the following prompt format for math questions:<br>*<|begin▁of▁sentence|><|User|>{math_question}\nPlease reason step by step, and put your final answer within \boxed{}.<|Assistant|><think>\n*
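
The two recommendations above can be sketched as a small helper. This is only an illustration, not part of the README: `build_math_prompt` and `build_user_only_messages` are hypothetical names, and the special-token strings are copied verbatim from the prompt format as it is rendered here, so the exact characters should be checked against the model's tokenizer before use.

```python
# Special-token strings copied from the recommended prompt format above.
# Assumption: they match the tokenizer's vocabulary character for character.
BOS = "<|begin▁of▁sentence|>"
USER = "<|User|>"
ASSISTANT = "<|Assistant|>"
INSTRUCTION = "Please reason step by step, and put your final answer within \\boxed{}."


def build_math_prompt(math_question: str) -> str:
    """Hypothetical helper: format a math question as a raw prompt string.

    No system prompt is included; the instruction is appended to the
    user turn, and the prompt ends with an opened <think> block.
    """
    return f"{BOS}{USER}{math_question}\n{INSTRUCTION}{ASSISTANT}<think>\n"


def build_user_only_messages(math_question: str) -> list[dict[str, str]]:
    """Hypothetical helper: chat messages with a single user turn and no system role."""
    return [{"role": "user", "content": f"{math_question}\n{INSTRUCTION}"}]


if __name__ == "__main__":
    print(build_math_prompt("What is 1+1?"))
```

If the raw string is passed to a tokenizer that prepends its own beginning-of-sentence token, that token may end up duplicated; the message-list form leaves the special tokens to the chat template instead.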
|
84 |
+
|
85 |
+
|
86 |
## Correspondence to
|
87 |
+
Yang Chen ([email protected]),<br>Zihan Liu ([email protected]),<br>Chankyu Lee ([email protected]),<br>Wei Ping ([email protected])
|
88 |
|
89 |
|
90 |
## License
|