Low-Rank Adapters Meet Neural Architecture Search for LLM Compression • Paper • 2501.16372 • Published Jan 23 • 9
TAID: Temporally Adaptive Interpolated Distillation for Efficient Knowledge Transfer in Language Models • Paper • 2501.16937 • Published Jan 28 • 6
Identifying Sensitive Weights via Post-quantization Integral • Paper • 2503.01901 • Published Feb 28 • 7
EAGLE-3: Scaling up Inference Acceleration of Large Language Models via Training-Time Test • Paper • 2503.01840 • Published Mar 3 • 5
SEAP: Training-free Sparse Expert Activation Pruning Unlock the Brainpower of Large Language Models • Paper • 2503.07605 • Published Mar 10 • 66
DistiLLM-2: A Contrastive Approach Boosts the Distillation of LLMs • Paper • 2503.07067 • Published Mar 10 • 31
Efficient Distillation of Classifier-Free Guidance using Adapters • Paper • 2503.07274 • Published Mar 10 • 4
RayFlow: Instance-Aware Diffusion Acceleration via Adaptive Flow Trajectories • Paper • 2503.07699 • Published Mar 10 • 5
QuartDepth: Post-Training Quantization for Real-Time Depth Estimation on the Edge • Paper • 2503.16709 • Published 25 days ago
LogQuant: Log-Distributed 2-Bit Quantization of KV Cache with Superior Accuracy Preservation • Paper • 2503.19950 • Published 20 days ago • 10
AdaptiVocab: Enhancing LLM Efficiency in Focused Domains through Lightweight Vocabulary Adaptation • Paper • 2503.19693 • Published 20 days ago • 75
Harnessing the Reasoning Economy: A Survey of Efficient Reasoning for Large Language Models • Paper • 2503.24377 • Published 14 days ago • 17
ShortV: Efficient Multimodal Large Language Models by Freezing Visual Tokens in Ineffective Layers • Paper • 2504.00502 • Published 13 days ago • 21
Distillation and Refinement of Reasoning in Small Language Models for Document Re-ranking • Paper • 2504.03947 • Published 10 days ago • 4
HybriMoE: Hybrid CPU-GPU Scheduling and Cache Management for Efficient MoE Inference • Paper • 2504.05897 • Published 6 days ago • 11
Accelerate Parallelizable Reasoning via Parallel Decoding within One Sequence • Paper • 2503.20533 • Published 19 days ago • 10