AutoTriton: Automatic Triton Programming with Reinforcement Learning in LLMs

This repository contains AutoTriton, an 8B-parameter model for Triton programming. It is built on Seed-Coder-8B-Reasoning and trained with supervised fine-tuning followed by reinforcement learning.

The model was presented in the paper AutoTriton: Automatic Triton Programming with Reinforcement Learning in LLMs.

Model Overview

AutoTriton is the first model dedicated to Triton programming powered by reinforcement learning (RL). It addresses the complex challenges in deep learning kernel development by automating the optimization of computational units, memory management, parallelism, and hardware-specific parameters that typically require extensive manual tuning.

The model's training process involves two sequential stages:

Supervised Fine-Tuning (SFT): AutoTriton first acquires essential Triton programming expertise from data produced by a high-quality data gathering pipeline.
Reinforcement Learning (RL): It then undergoes RL with the Group Relative Policy Optimization (GRPO) algorithm, combining a rule-based reward and an execution-based reward to further enhance its Triton programming ability.
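To make the RL stage more concrete, the following is a minimal sketch of how a rule-based reward and an execution-based reward might be combined and then turned into group-relative advantages, the core idea of GRPO. It is not the authors' implementation: the function names, the string checks in `rule_reward`, the multiplicative combination, and the use of the population standard deviation are all assumptions made for illustration.

```python
from statistics import mean, pstdev
from typing import List


def rule_reward(code: str) -> float:
    """Hypothetical rule check: does the text look like a Triton kernel at all?

    Assumption: a real rule-based reward would be stricter than two
    substring tests; this only illustrates the idea.
    """
    return 1.0 if "@triton.jit" in code and "tl." in code else 0.0


def combined_reward(code: str, passed_execution: bool) -> float:
    """Assumed combination: execution success only counts if the rule check passes."""
    execution_reward = 1.0 if passed_execution else 0.0
    return rule_reward(code) * execution_reward


def group_relative_advantages(rewards: List[float], eps: float = 1e-6) -> List[float]:
    """Normalize each reward against the group of samples drawn for one prompt.

    Each advantage is (reward - group mean) / (group std + eps), so samples
    are scored relative to their siblings rather than by a learned critic.
    """
    mu = mean(rewards)
    sigma = pstdev(rewards)  # population std; implementations differ on this choice
    return [(r - mu) / (sigma + eps) for r in rewards]


if __name__ == "__main__":
    # Four hypothetical completions for the same prompt.
    samples = [
        ("@triton.jit\ndef k(x): tl.load(x)", True),   # Triton code, runs correctly
        ("@triton.jit\ndef k(x): tl.load(x)", False),  # Triton code, fails the check
        ("def k(x): return x + 1", True),              # passes, but is not Triton
        ("@triton.jit\ndef k(x): tl.store(x, 0)", True),
    ]
    rewards = [combined_reward(code, ok) for code, ok in samples]
    print(rewards)
    print(group_relative_advantages(rewards))
```

Under this assumed design, a completion that produces correct results without actually using Triton receives no reward, which is one way a rule-based term can guard against shortcuts that an execution-only reward would accept.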

This approach underscores the promise of RL for automatically generating high-performance kernels, which are core components for building more efficient AI systems.

Evaluation

Experiments across five evaluation channels of TritonBench and KernelBench illustrate that the 8B AutoTriton model achieves performance comparable to mainstream large models, including Claude-4-Sonnet and DeepSeek-R1-0528. Further analysis highlights the crucial role of each module within AutoTriton, including the SFT stage, the RL stage, and the reward design strategy.

Usage

This model is compatible with the transformers library and can be loaded and used as follows:

from transformers import AutoModelForCausalLM, AutoTokenizer

model_name = "ai9stars/AutoTriton"  # Replace with the actual model ID if different

# Load the tokenizer and the model weights
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(model_name)

# Example usage for Triton kernel code generation
prompt = "Use triton language to write an add kernel for me."

# Tokenize the prompt and place the tensors on the same device as the model
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
# A complete kernel needs a generous token budget; adjust max_new_tokens as needed
outputs = model.generate(**inputs, max_new_tokens=1024)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
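The decoded text includes the prompt and may contain free-form reasoning around the generated kernel. The sketch below shows one way to pull out the final code block, assuming the model wraps its answer in a Markdown code fence; that output format is an assumption and is not confirmed here. `extract_last_code_block` is a hypothetical helper, not part of transformers or of this repository.

```python
import re
from typing import Optional

# Build the fence marker programmatically so this example contains no literal fence.
FENCE = "`" * 3

# Matches an opening fence with an optional "python"/"py" tag, then captures
# everything up to the next closing fence.
_BLOCK_RE = re.compile(FENCE + r"(?:python|py)?[ \t]*\n(.*?)" + FENCE, re.DOTALL)


def extract_last_code_block(text: str) -> Optional[str]:
    """Return the contents of the last fenced code block in `text`, or None.

    Assumption: the final fenced block holds the kernel the model settled on,
    and earlier blocks (if any) are drafts produced while reasoning.
    """
    blocks = _BLOCK_RE.findall(text)
    return blocks[-1].strip() if blocks else None


if __name__ == "__main__":
    # Stand-in for the decoded model output from the usage example.
    decoded = (
        "Let me think about the kernel.\n"
        + FENCE + "python\nimport triton\n" + FENCE + "\n"
        + "Here is the final version:\n"
        + FENCE + "python\nimport triton\nimport triton.language as tl\n" + FENCE + "\n"
    )
    print(extract_last_code_block(decoded))
```

If the output contains no fenced block, the helper returns `None`, so callers can fall back to using the raw text.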

GitHub Repository

For more details on the project, including the full code, training scripts, and additional benchmarks, please refer to the official GitHub repository: https://github.com/AI9Stars/AutoTriton

Citation

If you find this work useful, please consider citing our paper:

@article{li2025autotriton,
  title={AutoTriton: Automatic Triton Programming with Reinforcement Learning in LLMs},
  author={Li, Shangzhan and Wang, Zefan and He, Ye and Li, Yuxuan and Shi, Qi and Li, Jianling and Hu, Yonggang and Che, Wanxiang and Han, Xu and Liu, Zhiyuan and others},
  journal={arXiv preprint arXiv:2507.05687},
  year={2025}
}
