Towards Fast Multilingual LLM Inference: Speculative Decoding and Specialized Drafters • Paper • 2406.16758 • Published Jun 24, 2024 • 20
Block Transformer: Global-to-Local Language Modeling for Fast Inference • Paper • 2406.02657 • Published Jun 4, 2024 • 42
Towards Fast Inference: Exploring and Improving Blockwise Parallel Drafts • Paper • 2404.09221 • Published Apr 14, 2024 • 1
Distort, Distract, Decode: Instruction-Tuned Model Can Refine its Response from Noisy Instructions • Paper • 2311.00233 • Published Nov 1, 2023 • 4
Navigating Data Heterogeneity in Federated Learning: A Semi-Supervised Approach for Object Detection • Paper • 2310.17097 • Published Oct 26, 2023 • 3