Plug-and-Play 1.x-Bit KV Cache Quantization for Video Large Language Models • Paper • 2503.16257 • Published 4 days ago • 22
Stop Overthinking: A Survey on Efficient Reasoning for Large Language Models • Paper • 2503.16419 • Published 4 days ago • 57
BitsFusion: 1.99 bits Weight Quantization of Diffusion Model • Paper • 2406.04333 • Published Jun 6, 2024 • 38