---
base_model:
- BAAI/OpenSeek-Small-v1
---

# OpenSeek-Small-v1-SFT Documentation

## Overview

We adopt the [OctoThinker](https://natural-rugby-f7c.notion.site/OctoThinker-Revisiting-Mid-Training-In-the-Era-of-RL-Scaling-1d20b810e2d680c494a9f9dad0a90d53) mid-training recipe to build strong reasoning foundations. Training proceeds in two phases: a stable mid-training phase on 200 billion tokens from a mathematical corpus, followed by a 20-billion-token decay phase. We then fine-tune the model on the [Infinity-Instruct](https://huggingface.co/datasets/BAAI/Infinity-Instruct/tree/main/7M_core) dataset to improve its instruction-following capabilities. This model is open-sourced as a baseline for future experiments, such as enhancing the reasoning capabilities of small models through reinforcement learning. The model architecture is identical to that of OpenSeek-Small-v1.
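
## Usage

A minimal inference sketch with Hugging Face Transformers is below. The repository ID `BAAI/OpenSeek-Small-v1-SFT` and the availability of a chat template are assumptions based on the model name and its SFT setup, not confirmed details of this release.

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "BAAI/OpenSeek-Small-v1-SFT"  # hypothetical repo ID; adjust to the actual one

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id, device_map="auto")

# Assumes the SFT tokenizer ships a chat template (common for
# instruction-tuned checkpoints); otherwise format the prompt manually.
messages = [{"role": "user", "content": "Solve: what is 12 * 7?"}]
inputs = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)

outputs = model.generate(inputs, max_new_tokens=256)
print(tokenizer.decode(outputs[0][inputs.shape[-1]:], skip_special_tokens=True))
```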

## Evaluation

| Metric | GSM8K | MATH-500 | Minerva Math | OlympiadBench | Avg. |
| :--- | :--- | :--- | :--- | :--- | :--- |
| Pass@1 | 20.698 | 13.100 | 3.470 | 2.741 | 10.002 |
| Pass@4 | 41.768 | 19.100 | 8.415 | 4.997 | 18.570 |
| Pass@8 | 51.838 | 19.599 | 11.680 | 5.185 | 22.075 |
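
Pass@k is the probability that at least one of k sampled answers is correct, so the values above are pass rates in percent. This card does not state the estimator used, but pass@k is commonly computed with the unbiased estimator of Chen et al. (2021); a reference sketch is below, where the per-problem sample count `n` is an assumption:

```python
import numpy as np

def pass_at_k(n: int, c: int, k: int) -> float:
    """Unbiased pass@k estimator: 1 - C(n-c, k) / C(n, k), where n is the
    number of samples drawn per problem and c the number of correct ones.
    Computed as a stable product instead of raw binomial coefficients."""
    if n - c < k:
        return 1.0  # every size-k subset contains at least one correct sample
    return 1.0 - float(np.prod(1.0 - k / np.arange(n - c + 1, n + 1)))

# Example: 8 samples for one problem, 2 of them correct -> pass@4 estimate
print(pass_at_k(n=8, c=2, k=4))  # ~0.7857; averaged over problems for a benchmark score
```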