Barbarians at the Gate: How AI is Upending Systems Research Paper • 2510.06189 • Published 15 days ago • 9
SLA: Beyond Sparsity in Diffusion Transformers via Fine-Tunable Sparse-Linear Attention Paper • 2509.24006 • Published 24 days ago • 114
S-LoRA: Serving Thousands of Concurrent LoRA Adapters Paper • 2311.03285 • Published Nov 6, 2023 • 32
Rethinking Benchmark and Contamination for Language Models with Rephrased Samples Paper • 2311.04850 • Published Nov 8, 2023
Sparse VideoGen: Accelerating Video Diffusion Transformers with Spatial-Temporal Sparsity Paper • 2502.01776 • Published Feb 3 • 3
Twilight: Adaptive Attention Sparsity with Hierarchical Top-$p$ Pruning Paper • 2502.02770 • Published Feb 4 • 1
Sparse VideoGen2: Accelerate Video Generation with Sparse Attention via Semantic-Aware Permutation Paper • 2505.18875 • Published May 24 • 42
The Danger of Overthinking: Examining the Reasoning-Action Dilemma in Agentic Tasks Paper • 2502.08235 • Published Feb 12 • 58