---
base_model: Qwen/Qwen2.5-Math-7B
language:
- en
pipeline_tag: text-generation
tags:
- chat
library_name: transformers
license: apache-2.0
license_link: https://huggingface.co/Qwen/Qwen2.5-Math-7B/blob/main/LICENSE
---

# Qwen2.5-Math-7B-RoPE-300k
This model is a variant of [Qwen/Qwen2.5-Math-7B](https://huggingface.co/Qwen/Qwen2.5-Math-7B) with its RoPE base frequency raised from 10k to 300k, extending the usable context length from 4k to 32k tokens.
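The same change can be sketched from the base checkpoint with 🤗 Transformers. This is a minimal illustration, not the exact procedure used to produce this repository: the attribute values (`rope_theta=300000.0`, `max_position_embeddings=32768`) are inferred from the 10k→300k and 4k→32k figures above, and the generation call at the end is only an assumed usage example.

```python
import torch
from transformers import AutoConfig, AutoModelForCausalLM, AutoTokenizer

base_id = "Qwen/Qwen2.5-Math-7B"  # the base model this variant derives from

# Load the base config and apply the RoPE change described above.
config = AutoConfig.from_pretrained(base_id)
config.rope_theta = 300000.0            # RoPE base frequency: 10k -> 300k (assumed exact value)
config.max_position_embeddings = 32768  # context length: 4k -> 32k (assumed exact value)

model = AutoModelForCausalLM.from_pretrained(
    base_id,
    config=config,
    torch_dtype=torch.bfloat16,
    device_map="auto",
)
tokenizer = AutoTokenizer.from_pretrained(base_id)

# Hypothetical usage: generate a solution to a short math prompt.
prompt = "Find all real x such that x^2 - 5x + 6 = 0."
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=512)
print(tokenizer.decode(outputs[0][inputs["input_ids"].shape[1]:], skip_special_tokens=True))
```

Raising the RoPE base frequency slows the rotation of the positional embeddings, which is a common way to stretch a model's context window without retraining from scratch.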
## Citation
If you find this model useful, please cite the original Qwen2.5-Math paper:
```bibtex
@article{yang2024qwen2,
  title={Qwen2.5-Math Technical Report: Toward Mathematical Expert Model via Self-Improvement},
  author={Yang, An and Zhang, Beichen and Hui, Binyuan and Gao, Bofei and Yu, Bowen and Li, Chengpeng and Liu, Dayiheng and Tu, Jianhong and Zhou, Jingren and Lin, Junyang and others},
  journal={arXiv preprint arXiv:2409.12122},
  year={2024}
}
```