any4: Learned 4-bit Numeric Representation for LLMs • Paper 2507.04610 • Published 3 days ago
LayerSkip: Enabling Early Exit Inference and Self-Speculative Decoding • Paper 2404.16710 • Published Apr 25, 2024