Seed Diffusion: A Large-Scale Diffusion Language Model with High-Speed Inference
Abstract
Seed Diffusion Preview, a discrete-state diffusion language model, achieves fast inference through parallel generation, running significantly faster than Mercury and Gemini Diffusion while maintaining competitive quality on standard code benchmarks.
We present Seed Diffusion Preview, a large-scale language model based on discrete-state diffusion, offering remarkably fast inference. Thanks to non-sequential, parallel generation, discrete diffusion models provide a notable speedup that mitigates the inherent latency of token-by-token decoding, as demonstrated recently (e.g., Mercury Coder, Gemini Diffusion). Seed Diffusion Preview achieves an inference speed of 2,146 tokens/s on H20 GPUs while maintaining competitive performance across a sweep of standard code evaluation benchmarks. It is significantly faster than the contemporary Mercury and Gemini Diffusion models, establishing a new state of the art on the speed-quality Pareto frontier for code models.
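The speedup described in the abstract comes from denoising many positions per forward pass instead of emitting one token per pass. The sketch below illustrates the general idea with a MaskGIT-style confidence-based unmasking loop; this is a minimal sketch of generic masked-diffusion decoding, not Seed Diffusion's actual algorithm, and `denoiser`, `MASK_ID`, and the unmasking schedule are illustrative assumptions.

```python
import torch

MASK_ID = 0    # hypothetical id of the [MASK] token
SEQ_LEN = 64   # length of the sequence to generate
NUM_STEPS = 8  # denoising steps; fewer steps => more tokens filled per step

def parallel_decode(denoiser, prompt_ids, seq_len=SEQ_LEN, num_steps=NUM_STEPS):
    """Decode a whole sequence in num_steps parallel denoising passes.

    `denoiser(prompt_ids, x)` is an assumed model call returning logits
    of shape (1, seq_len, vocab) for every position at once.
    """
    # Start from an all-masked canvas instead of decoding left to right.
    x = torch.full((1, seq_len), MASK_ID, dtype=torch.long)
    for step in range(num_steps):
        logits = denoiser(prompt_ids, x)
        probs = logits.softmax(-1)
        conf, pred = probs.max(-1)          # per-position confidence and argmax
        still_masked = x == MASK_ID
        # Unmask the most confident positions this step; all positions are
        # predicted in parallel, which is where the speedup comes from.
        k = max(1, int(still_masked.sum()) // (num_steps - step))
        conf = conf.masked_fill(~still_masked, -1.0)
        top = conf.topk(k, dim=-1).indices
        x[0, top[0]] = pred[0, top[0]]
    return x
```

A schedule like this recovers the full sequence in `num_steps` forward passes rather than `seq_len`, which is the source of the large throughput gap over token-by-token autoregressive decoding quoted above.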
Community
That's so coooooool
On the main graph, what would the tokens per second of Seed Coder Instruct be on the same hardware?
It's confusing that there isn't a clear and direct throughput comparison between this model and an autoregressive one.
Hi, thanks for your interest. We just ran the evaluation of seed-coder-instruct under our deployment settings; the speed is 344 tokens/s, roughly 6x slower than Seed Diffusion Preview's 2,146 tokens/s. And good suggestion! We'll consider updating the main figure :)
nice work
This looks amazing! Any plans to open source this?
This is an automated message from the Librarian Bot. I found the following papers similar to this paper, recommended by the Semantic Scholar API:
- Mercury: Ultra-Fast Language Models Based on Diffusion (2025)
- Discrete Diffusion in Large Language and Multimodal Models: A Survey (2025)
- Accelerating Diffusion Large Language Models with SlowFast Sampling: The Three Golden Principles (2025)
- DIFFA: Large Language Diffusion Models Can Listen and Understand (2025)
- Discrete Diffusion Models for Language Generation (2025)
- Plan for Speed - Dilated Scheduling for Masked Diffusion Language Models (2025)
- Beyond Fixed: Variable-Length Denoising for Diffusion Large Language Models (2025)
You can directly ask Librarian Bot for paper recommendations by tagging it in a comment: `@librarian-bot recommend`
Pretty cool! I recently added support for LLaDA and Dream models in llama.cpp, would love to add support if you ever plan to open source the inference code!
Wow, excellent work! May I ask for the links to the LLaDA and Dream support in llama.cpp?
Very cool getting faster results than Gemini! Would love to play around with this if you open source it!