XAttention: Block Sparse Attention with Antidiagonal Scoring Paper • 2503.16428 • Published Mar 20 • 15
🧠Reasoning datasets Collection • Datasets with reasoning traces for math and code released by the community • 24 items • Updated May 19 • 168
DuoAttention: Efficient Long-Context LLM Inference with Retrieval and Streaming Heads Paper • 2410.10819 • Published Oct 14, 2024 • 8
Efficient Streaming Language Models with Attention Sinks Paper • 2309.17453 • Published Sep 29, 2023 • 14