One-Step Residual Shifting Diffusion for Image Super-Resolution via Distillation Paper • 2503.13358 • Published 6 days ago • 82
Stop Overthinking: A Survey on Efficient Reasoning for Large Language Models Paper • 2503.16419 • Published 3 days ago • 53
Optimizing Decomposition for Optimal Claim Verification Paper • 2503.15354 • Published 4 days ago • 18
φ-Decoding: Adaptive Foresight Sampling for Balanced Inference-Time Exploration and Exploitation Paper • 2503.13288 • Published 6 days ago • 46
Implicit Reasoning in Transformers is Reasoning through Shortcuts Paper • 2503.07604 • Published 13 days ago • 21
WritingBench: A Comprehensive Benchmark for Generative Writing Paper • 2503.05244 • Published 16 days ago • 17
DistiLLM-2: A Contrastive Approach Boosts the Distillation of LLMs Paper • 2503.07067 • Published 13 days ago • 29
SEAP: Training-free Sparse Expert Activation Pruning Unlock the Brainpower of Large Language Models Paper • 2503.07605 • Published 13 days ago • 65
Feature-Level Insights into Artificial Text Detection with Sparse Autoencoders Paper • 2503.03601 • Published 18 days ago • 215
HybridNorm: Towards Stable and Efficient Transformer Training via Hybrid Normalization Paper • 2503.04598 • Published 17 days ago • 18
EuroBERT: Scaling Multilingual Encoders for European Languages Paper • 2503.05500 • Published 16 days ago • 75