Generation
- Fast Best-of-N Decoding via Speculative Rejection (arXiv: 2410.20290)
Long Context
- Why Does the Effective Context Length of LLMs Fall Short? (arXiv: 2410.18745)
- Language Models can Self-Lengthen to Generate Long Texts (arXiv: 2410.23933)
- ShadowKV: KV Cache in Shadows for High-Throughput Long-Context LLM Inference (arXiv: 2410.21465)