Multi-Granular Spatio-Temporal Token Merging for Training-Free Acceleration of Video LLMs Paper • 2507.07990 • Published Jul 10 • 45
TPLA: Tensor Parallel Latent Attention for Efficient Disaggregated Prefill \& Decode Inference Paper • 2508.15881 • Published Aug 21 • 8