Qwen2.5-Math-7B-RoPE-300k

This model is a variant of Qwen/Qwen2.5-Math-7B-Instruct with its RoPE base frequency raised from 10,000 to 300,000, which extends the context length from 4k to 32k tokens.
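To illustrate why a larger RoPE base helps with longer contexts, the following minimal sketch computes the standard RoPE inverse frequencies, base^(-2i/d), for both base values and compares the longest rotation wavelength. It is not taken from the model's code; the head dimension of 128 and the helper names `rope_inv_freq` and `longest_wavelength` are assumptions made for illustration only.

```python
import math


def rope_inv_freq(base: float, head_dim: int) -> list[float]:
    """Standard RoPE inverse frequencies: base^(-2i/d) for i = 0 .. d/2 - 1."""
    return [base ** (-2 * i / head_dim) for i in range(head_dim // 2)]


def longest_wavelength(base: float, head_dim: int) -> float:
    """Wavelength, in token positions, of the slowest-rotating dimension pair."""
    slowest = rope_inv_freq(base, head_dim)[-1]
    return 2 * math.pi / slowest


HEAD_DIM = 128  # assumption: per-head dimension, not stated in this README

for base in (10_000.0, 300_000.0):
    wavelength = longest_wavelength(base, HEAD_DIM)
    print(f"base={base:>9.0f}  longest wavelength ~ {wavelength:,.0f} positions")
```

Raising the base slows the rotation of every dimension pair except the first, so positions that are far apart stay distinguishable over a longer range. The sketch only shows the frequency arithmetic and says nothing about how this checkpoint was produced or evaluated.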

Citation

If you find this model useful, please cite the original Qwen2.5-Math paper:

@article{yang2024qwen2,
  title={Qwen2.5-Math Technical Report: Toward Mathematical Expert Model via Self-Improvement},
  author={Yang, An and Zhang, Beichen and Hui, Binyuan and Gao, Bofei and Yu, Bowen and Li, Chengpeng and Liu, Dayiheng and Tu, Jianhong and Zhou, Jingren and Lin, Junyang and others},
  journal={arXiv preprint arXiv:2409.12122},
  year={2024}
}