Update README.md
README.md
CHANGED
@@ -54,13 +54,15 @@ Thinking disabled:
 * [nbeerbower/Arkhaios-DPO](https://huggingface.co/datasets/nbeerbower/Arkhaios-DPO)
 * [jondurbin/truthy-dpo-v0.1](https://huggingface.co/datasets/jondurbin/truthy-dpo-v0.1)
 * [antiven0m/physical-reasoning-dpo](https://huggingface.co/datasets/antiven0m/physical-reasoning-dpo)
-* [Atsunori/HelpSteer2-DPO](https://huggingface.co/datasets/Atsunori/HelpSteer2-DPO)
+* [Atsunori/HelpSteer2-DPO](https://huggingface.co/datasets/Atsunori/HelpSteer2-DPO)
 
 ### Chain of Thought
 
-
-
-* [
+30,000 samples of each dataset with thinking enabled.
+
+* [GeneralReasoning/GeneralThought-430K](https://huggingface.co/datasets/GeneralReasoning/GeneralThought-430K)
+* [nvidia/OpenMathReasoning](https://huggingface.co/datasets/nvidia/OpenMathReasoning)
+* [nvidia/OpenCodeReasoning](https://huggingface.co/datasets/nvidia/OpenCodeReasoning)
 
 ## Results
 