
CoRT-Hint-Engineering-1.5B-RL

Model Description

CoRT-Hint-Engineering-1.5B-RL is a 1.5B parameter model trained using the CoRT (Code-integrated Reasoning within Thinking) framework. This model specializes in mathematical reasoning by effectively integrating natural language reasoning with Python code execution, with a focus on token efficiency.

This model uses the Hint-Engineering approach, which strategically inserts targeted hints at key decision points during reasoning to optimize code usage and minimize unnecessary verification.
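For illustration, a code-integrated trace interleaves natural-language reasoning with executable Python blocks and their captured outputs. The fragment below, hand-written for this card around the walk-speed problem shown under Input Format, is only a sketch of that format (the ```python / ```output fencing is inferred from the inference flags, not verbatim model output):

... The two conditions give 9/s + t/60 = 4 and 9/(s + 2) + t/60 = 2.4. Rather than solving this system by hand, I will compute it directly.
```python
from sympy import Eq, solve, symbols
s, t = symbols("s t", positive=True)
print(solve([Eq(9/s + t/60, 4), Eq(9/(s + 2) + t/60, 2.4)], [s, t]))
```
```output
[(2.50000000000000, 24.0000000000000)]
```
So s = 2.5 km/h and t = 24 minutes; a well-placed hint would stop the model from re-verifying this result with further code.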

Key Features

  • Superior Token Efficiency: Uses roughly 50% fewer tokens than baseline models while maintaining competitive accuracy
  • Balanced Code Usage: Balances code usage between calculation (51.1%) and verification (48.9%)
  • Strategic Hint Placement: Inserts hints at critical reasoning points to prevent inefficient behaviors such as redundant verification
  • Multi-turn Tool-Integrated Reasoning: Supports interactive code execution within reasoning chains (sketched below)
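
To make the last feature concrete, here is a minimal sketch of the multi-turn loop such a harness runs: generation stops at the end of each code block, the block is executed, and its output is appended before generation resumes. This is an illustration written for this card, not the official inference script; generate_until_code_end is a hypothetical stand-in for the underlying vLLM call.

import contextlib
import io
import re

def run_python(code: str) -> str:
    # Execute one code block and capture its stdout, like a Jupyter-style tool call.
    buf = io.StringIO()
    try:
        with contextlib.redirect_stdout(buf):
            exec(code, {})
    except Exception as e:
        buf.write(f"{type(e).__name__}: {e}")
    return buf.getvalue()

def solve_with_tools(prompt: str, generate_until_code_end, max_func_call: int = 15) -> str:
    # generate_until_code_end(context) stands in for a model call that stops
    # right after a ```python ... ``` block (cf. --stop_tokens_mode below).
    context = prompt
    for _ in range(max_func_call):
        chunk = generate_until_code_end(context)
        context += chunk
        blocks = re.findall(r"```python\n(.*?)```", chunk, re.DOTALL)
        if not blocks:
            break  # no code emitted: the model has given its final answer
        context += "\n```output\n" + run_python(blocks[-1]) + "\n```\n"
    return context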

Performance

Benchmark    Accuracy
AIME24       41.0%
AIME25       29.4%
AMC23        70.0%
MATH500      85.8%
Olympiad     55.6%
Average      56.4%

Model Architecture

  • Base Model: DeepSeek-R1-Distill-Qwen-1.5B
  • Training Method: Supervised Fine-tuning (SFT) → Reinforcement Learning (RL)
  • Framework: CoRT with Hint-Engineering
  • Special Features: Strategic hint insertion at decision points

Usage

⚠️ Important: This model requires multi-turn tool-integrated reasoning capabilities. Please use our specialized inference script from the CoRT GitHub repository for optimal performance.

Installation

First, clone and install the CoRT repository:

git clone https://github.com/ChengpengLi1003/CoRT.git
cd CoRT
# Follow installation instructions in the repository

Inference

TOKENIZERS_PARALLELISM=false VLLM_USE_V1=1 python -m infer.inference_vllm_dp_mj \
    --input_file <path_to_input_file_in_jsonl> \
    --start 0 \
    --end 0 \
    --output_dir <path_to_output_dir> \
    --model_name_or_path <path_to_this_model> \
    --engine vllm \
    --temperature 0.6 \
    --top_p 0.95 \
    --n_sampling 16 \
    --stop_tokens_mode normal_code_block_end \
    --max_tokens_per_call 32768 \
    --max_model_len 32768 \
    --max_func_call 15 \
    --func_call_mode jupyter \
    --data_parallel_size 1 \
    --tensor_parallel_size 1

Input Format

The input should be a JSONL file where each line is a JSON object with a "prompt" field:

{
    "prompt": "Every morning Aya goes for a $9$-kilometer-long walk and stops at a coffee shop afterwards. When she walks at a constant speed of $s$ kilometers per hour, the walk takes her 4 hours, including $t$ minutes spent in the coffee shop. When she walks $s+2$ kilometers per hour, the walk takes her 2 hours and 24 minutes, including $t$ minutes spent in the coffee shop. Suppose Aya walks at $s+\\frac{1}{2}$ kilometers per hour. Find the number of minutes the walk takes her, including the $t$ minutes spent in the coffee shop.\nPlease integrate natural language reasoning with python programs to solve the problem above, and put your final answer within \\boxed{}."
}
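
One minimal way to produce such a file (a generic helper written for this card, not part of the CoRT repo):

import json

problem = "Every morning Aya goes for a $9$-kilometer-long walk ..."  # full problem text
suffix = (
    "\nPlease integrate natural language reasoning with python programs to solve "
    "the problem above, and put your final answer within \\boxed{}."
)
with open("input.jsonl", "w") as f:
    f.write(json.dumps({"prompt": problem + suffix}) + "\n")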

Related Resources

  • Paper: https://arxiv.org/abs/2506.09820
  • Code: https://github.com/ChengpengLi1003/CoRT

Citation

If you find our work useful for your research, please cite our paper:

@misc{li2025cortcodeintegratedreasoningthinking,
      title={CoRT: Code-integrated Reasoning within Thinking}, 
      author={Chengpeng Li and Zhengyang Tang and Ziniu Li and Mingfeng Xue and Keqin Bao and Tian Ding and Ruoyu Sun and Benyou Wang and Xiang Wang and Junyang Lin and Dayiheng Liu},
      year={2025},
      eprint={2506.09820},
      archivePrefix={arXiv},
      primaryClass={cs.CL},
      url={https://arxiv.org/abs/2506.09820}, 
}