ISO: Overlap of Computation and Communication within Sequence For LLM Inference Paper • 2409.11155 • Published Sep 4, 2024
Clover-2: Accurate Inference for Regressive Lightweight Speculative Decoding Paper • 2408.00264 • Published Aug 1, 2024
Clover: Regressive Lightweight Speculative Decoding with Sequential Knowledge Paper • 2405.00263 • Published May 1, 2024