arxiv:2506.09820

CoRT: Code-integrated Reasoning within Thinking

Published on Jun 11

Submitted by ChengpengLi on Jun 12

Authors: Zhengyang Tang

Abstract

AI-generated summary: CoRT enhances Large Reasoning Models with a Code Interpreter through Hint-Engineering, improving mathematical reasoning efficiency and reducing token usage.

Large Reasoning Models (LRMs) like o1 and DeepSeek-R1 have shown remarkable progress in natural language reasoning with long chain-of-thought (CoT), yet they remain inefficient or inaccurate when handling complex mathematical operations. Addressing these limitations with computational tools (e.g., computation libraries and symbolic solvers) is promising, but it introduces a technical challenge: a Code Interpreter (CI) brings in external knowledge beyond the model's internal text representations, so combining the two directly is inefficient. This paper introduces CoRT, a post-training framework for teaching LRMs to leverage CI effectively and efficiently. As a first step, we address the data scarcity issue by synthesizing code-integrated reasoning data through Hint-Engineering, which strategically inserts different hints at appropriate positions to optimize LRM-CI interaction. We manually create 30 high-quality samples, upon which we post-train models ranging from 1.5B to 32B parameters using supervised fine-tuning, rejection fine-tuning, and reinforcement learning. Our experimental results demonstrate that Hint-Engineering models achieve 4% and 8% absolute improvements on DeepSeek-R1-Distill-Qwen-32B and DeepSeek-R1-Distill-Qwen-1.5B, respectively, across five challenging mathematical reasoning datasets. Furthermore, Hint-Engineering models use about 30% fewer tokens for the 32B model and 50% fewer tokens for the 1.5B model compared with the natural language models. The models and code are available at https://github.com/ChengpengLi1003/CoRT.
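To make "code-integrated reasoning" concrete, here is a minimal sketch of the general idea: code emitted inside a reasoning trace is executed, and the interpreter's output is written back into the trace so later reasoning can rely on it. This is not the CoRT implementation. The `<code>`/`<output>` delimiters and the names `run_snippet` and `interleave_outputs` are assumptions made for illustration, and the snippet does no sandboxing.

```python
import contextlib
import io
import re

# Hypothetical delimiters for code emitted inside a reasoning trace;
# the format actually used by CoRT may differ.
CODE_SPAN = re.compile(r"<code>(.*?)</code>", re.DOTALL)


def run_snippet(source: str) -> str:
    """Execute a Python snippet and return what it printed, or the error."""
    buffer = io.StringIO()
    try:
        with contextlib.redirect_stdout(buffer):
            exec(source, {})  # illustration only: no sandboxing or timeout
    except Exception as exc:
        return f"{type(exc).__name__}: {exc}"
    return buffer.getvalue().strip()


def interleave_outputs(trace: str) -> str:
    """Append an <output> span after every <code> span in the trace."""
    def attach(match: re.Match) -> str:
        result = run_snippet(match.group(1))
        return f"{match.group(0)}\n<output>{result}</output>"

    return CODE_SPAN.sub(attach, trace)


if __name__ == "__main__":
    trace = "I need 17**5 exactly.\n<code>print(17**5)</code>\nNow continue."
    print(interleave_outputs(trace))
```

In this sketch the exact arithmetic is delegated to the interpreter instead of being worked out token by token in natural language, which is the kind of inefficiency the abstract describes.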

Community

ChengpengLi

Paper submitter 1 day ago

We’re excited to share our new paper “CoRT: Code-integrated Reasoning within Thinking”!

🤖 A post-training framework that teaches Large Reasoning Models (LRMs) to better leverage Code Interpreters for enhanced mathematical reasoning.

🔍 Key Highlights:

Strategic hint engineering for LRM-CI interaction
Achieves strong performance with only 30 high-quality samples
Reduces token usage by 30–50% while maintaining accuracy
Supports full training pipeline: SFT → RFT → RL
📄 Check it out on arXiv: https://arxiv.org/abs/2506.09820
💻 Code & models: https://github.com/ChengpengLi1003/CoRT