R-KV: Redundancy-aware KV Cache Compression for Reasoning Models Paper • 2505.24133 • Published May 30, 2025 • 1
Efficient Deweather Mixture-of-Experts with Uncertainty-aware Feature-wise Linear Modulation Paper • 2312.16610 • Published Dec 27, 2023
DrafterBench: Benchmarking Large Language Models for Tasks Automation in Civil Engineering Paper • 2507.11527 • Published Jul 15, 2025 • 31
Diffusion as Shader: 3D-aware Video Diffusion for Versatile Video Generation Control Paper • 2501.03847 • Published Jan 7, 2025 • 23
Meissonic: Revitalizing Masked Generative Transformers for Efficient High-Resolution Text-to-Image Synthesis Paper • 2410.08261 • Published Oct 10, 2024 • 53
An Item is Worth a Prompt: Versatile Image Editing with Disentangled Control Paper • 2403.04880 • Published Mar 7, 2024 • 6
Interpolating Video-LLMs: Toward Longer-sequence LMMs in a Training-free Manner Paper • 2409.12963 • Published Sep 19, 2024
K-Sort Arena: Efficient and Reliable Benchmarking for Generative Models via K-wise Human Preferences Paper • 2408.14468 • Published Aug 26, 2024 • 38
QFT: Quantized Full-parameter Tuning of LLMs with Affordable Resources Paper • 2310.07147 • Published Oct 11, 2023 • 1
LLM Inference Unveiled: Survey and Roofline Model Insights Paper • 2402.16363 • Published Feb 26, 2024 • 2
MAgIC: Investigation of Large Language Model Powered Multi-Agent in Cognition, Adaptability, Rationality and Collaboration Paper • 2311.08562 • Published Nov 14, 2023
Magic-Me: Identity-Specific Video Customized Diffusion Paper • 2402.09368 • Published Feb 14, 2024 • 31
NoisyQuant: Noisy Bias-Enhanced Post-Training Activation Quantization for Vision Transformers Paper • 2211.16056 • Published Nov 29, 2022 • 4
CSQ: Growing Mixed-Precision Quantization Scheme with Bi-level Continuous Sparsification Paper • 2212.02770 • Published Dec 6, 2022
UnrealNAS: Can We Search Neural Architectures with Unreal Data? Paper • 2205.02162 • Published May 4, 2022
Applications and Techniques for Fast Machine Learning in Science Paper • 2110.13041 • Published Oct 25, 2021