CoRT-Prompt-Hint-1.5B-RL

Model Description

CoRT-Prompt-Hint-1.5B-RL is a 1.5B parameter model trained using the CoRT (Code-integrated Reasoning within Thinking) framework. This model specializes in mathematical reasoning by effectively integrating natural language reasoning with Python code execution.

This model uses the Prompt-Hint approach, which strategically inserts hints at the beginning of the reasoning process to encourage code usage throughout problem-solving.

Key Features

  • High Performance: Achieves 58.3% average accuracy across mathematical reasoning benchmarks
  • Code Integration: Seamlessly combines natural language reasoning with Python code execution
  • Multi-turn Tool-Integrated Reasoning: Supports interactive code execution within reasoning chains
  • Optimized for Mathematics: Specifically trained on mathematical problem-solving tasks

Performance

Benchmark    Accuracy
---------    --------
AIME24       43.1%
AIME25       30.2%
AMC23        73.8%
MATH500      87.3%
Olympiad     57.1%
Average      58.3%

Model Architecture

  • Base Model: DeepSeek-R1-Distill-Qwen-1.5B
  • Training Method: Supervised Fine-tuning (SFT) → Reinforcement Learning (RL)
  • Framework: CoRT (Code-integrated Reasoning within Thinking)

Usage

⚠️ Important: This model requires multi-turn tool-integrated reasoning capabilities. Please use our specialized inference script from the CoRT GitHub repository for optimal performance.

Installation

First, clone and install the CoRT repository:

git clone https://github.com/ChengpengLi1003/CoRT.git
cd CoRT
# Follow installation instructions in the repository
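
After installation, you can verify that the core inference dependency is available with a quick check like the sketch below (vLLM is assumed here because the inference command sets VLLM_USE_V1 and passes --engine vllm; the repository lists the full set of requirements):

# Quick sanity check that vLLM is installed in the current environment.
# The exact dependency versions required by CoRT are listed in the repository.
import vllm

print("vLLM version:", vllm.__version__)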

Inference

TOKENIZERS_PARALLELISM=false VLLM_USE_V1=1 python -m infer.inference_vllm_dp_mj \
    --input_file <path_to_input_file_in_jsonl> \
    --start 0 \
    --end 0 \
    --output_dir <path_to_output_dir> \
    --model_name_or_path <path_to_this_model> \
    --engine vllm \
    --temperature 0.6 \
    --top_p 0.95 \
    --n_sampling 16 \
    --stop_tokens_mode normal_code_block_end \
    --max_tokens_per_call 32768 \
    --max_model_len 32768 \
    --max_func_call 15 \
    --func_call_mode jupyter \
    --data_parallel_size 1 \
    --tensor_parallel_size 1
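
If you prefer to launch the inference script from Python (for example, to sweep over several input files), a minimal wrapper like the sketch below assembles the same command as above and runs it with subprocess. The input file and output directory are placeholders; all other flag values mirror the shell example:

# Minimal Python wrapper around the CoRT inference command shown above.
# Paths below are placeholders; flag values mirror the shell example.
import os
import subprocess

env = dict(os.environ, TOKENIZERS_PARALLELISM="false", VLLM_USE_V1="1")

cmd = [
    "python", "-m", "infer.inference_vllm_dp_mj",
    "--input_file", "data/problems.jsonl",        # placeholder input JSONL
    "--start", "0",
    "--end", "0",
    "--output_dir", "outputs/cort_prompt_hint",   # placeholder output dir
    "--model_name_or_path", "theshyustc/CoRT-Prompt-Hint-1.5B-RL",  # or a local path
    "--engine", "vllm",
    "--temperature", "0.6",
    "--top_p", "0.95",
    "--n_sampling", "16",
    "--stop_tokens_mode", "normal_code_block_end",
    "--max_tokens_per_call", "32768",
    "--max_model_len", "32768",
    "--max_func_call", "15",
    "--func_call_mode", "jupyter",
    "--data_parallel_size", "1",
    "--tensor_parallel_size", "1",
]

# Run from the root of the cloned CoRT repository so that infer.* is importable.
subprocess.run(cmd, env=env, check=True)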

Input Format

The input should be a JSONL file where each line contains a JSON object with a prompt field:

{
    "prompt": "Every morning Aya goes for a $9$-kilometer-long walk and stops at a coffee shop afterwards. When she walks at a constant speed of $s$ kilometers per hour, the walk takes her 4 hours, including $t$ minutes spent in the coffee shop. When she walks $s+2$ kilometers per hour, the walk takes her 2 hours and 24 minutes, including $t$ minutes spent in the coffee shop. Suppose Aya walks at $s+\\frac{1}{2}$ kilometers per hour. Find the number of minutes the walk takes her, including the $t$ minutes spent in the coffee shop.\nPlease integrate natural language reasoning with python programs to solve the problem above, and put your final answer within \\boxed{}."
}
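
If your problems are stored as plain strings, a small helper like the sketch below can produce this JSONL format, appending the same code-integration instruction shown in the example above (the problem text and output file name here are hypothetical):

# Build a JSONL input file in the format expected by the inference script.
# The problem statement and output file name are hypothetical examples.
import json

INSTRUCTION = (
    "\nPlease integrate natural language reasoning with python programs "
    "to solve the problem above, and put your final answer within \\boxed{}."
)

problems = [
    "Find the smallest positive integer $n$ such that $n^2 + n + 41$ is not prime.",
]

with open("problems.jsonl", "w", encoding="utf-8") as f:
    for problem in problems:
        f.write(json.dumps({"prompt": problem + INSTRUCTION}) + "\n")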

Related Resources

  • Paper: CoRT: Code-integrated Reasoning within Thinking (https://arxiv.org/abs/2506.09820)
  • Code: CoRT GitHub repository (https://github.com/ChengpengLi1003/CoRT)

Citation

If you find our work useful for your research, please cite our paper:

@misc{li2025cortcodeintegratedreasoningthinking,
      title={CoRT: Code-integrated Reasoning within Thinking}, 
      author={Chengpeng Li and Zhengyang Tang and Ziniu Li and Mingfeng Xue and Keqin Bao and Tian Ding and Ruoyu Sun and Benyou Wang and Xiang Wang and Junyang Lin and Dayiheng Liu},
      year={2025},
      eprint={2506.09820},
      archivePrefix={arXiv},
      primaryClass={cs.CL},
      url={https://arxiv.org/abs/2506.09820}, 
}