KaVa: Latent Reasoning via Compressed KV-Cache Distillation Paper • 2510.02312 • Published Oct 2, 2025 • 2
Provence: efficient and robust context pruning for retrieval-augmented generation Article • Published Jan 28, 2025 • 25
ReasonFlux: Hierarchical LLM Reasoning via Scaling Thought Templates Paper • 2502.06772 • Published Feb 10, 2025 • 21
FastKV: KV Cache Compression for Fast Long-Context Processing with Token-Selective Propagation Paper • 2502.01068 • Published Feb 3, 2025 • 18