metadata
base_model:
- allenai/OLMo-2-1124-7B-SFT
license: apache-2.0
datasets:
- math
metrics:
- accuracy
pipeline_tag: text-generation
language:
- en
OLMo-2-7B-SFT-Intuitor-MATH-1EPOCH
Description:
An Intuitor-fine-tuned version of Allenai/OLMo-2-1124-7B-SFT trained on the MATH dataset.
Citation
@article{zhao2025learning,
title={Learning to Reason without External Rewards},
author={Zhao, Xuandong and Kang, Zhewei and Feng, Aosong and Levine, Sergey and Song, Dawn},
journal={arXiv preprint arXiv:2505.19590},
year={2025}
}