OLMoTrace: Tracing Language Model Outputs Back to Trillions of Training Tokens • Paper • arXiv:2504.07096 • Published 5 days ago • 64 upvotes
SuperBPE • Collection • SuperBPE tokenizers and models trained with them • 8 items • Updated 5 days ago • 14 upvotes
PaperBench: Evaluating AI's Ability to Replicate AI Research • Paper • arXiv:2504.01848 • Published 12 days ago • 34 upvotes
Efficient Inference for Large Reasoning Models: A Survey • Paper • arXiv:2503.23077 • Published 16 days ago • 45 upvotes
A Survey of Efficient Reasoning for Large Reasoning Models: Language, Multimodality, and Beyond • Paper • arXiv:2503.21614 • Published 18 days ago • 39 upvotes
I Have Covered All the Bases Here: Interpreting Reasoning Features in Large Language Models via Sparse Autoencoders • Paper • arXiv:2503.18878 • Published 21 days ago • 113 upvotes
One-Step Residual Shifting Diffusion for Image Super-Resolution via Distillation • Paper • arXiv:2503.13358 • Published 28 days ago • 95 upvotes
Stop Overthinking: A Survey on Efficient Reasoning for Large Language Models • Paper • arXiv:2503.16419 • Published 25 days ago • 68 upvotes
Optimizing Decomposition for Optimal Claim Verification • Paper • arXiv:2503.15354 • Published 26 days ago • 18 upvotes
φ-Decoding: Adaptive Foresight Sampling for Balanced Inference-Time Exploration and Exploitation • Paper • arXiv:2503.13288 • Published 28 days ago • 49 upvotes
Implicit Reasoning in Transformers is Reasoning through Shortcuts • Paper • arXiv:2503.07604 • Published Mar 10 • 21 upvotes
WritingBench: A Comprehensive Benchmark for Generative Writing • Paper • arXiv:2503.05244 • Published Mar 7 • 17 upvotes
DistiLLM-2: A Contrastive Approach Boosts the Distillation of LLMs • Paper • arXiv:2503.07067 • Published Mar 10 • 31 upvotes
SEAP: Training-free Sparse Expert Activation Pruning Unlock the Brainpower of Large Language Models • Paper • arXiv:2503.07605 • Published Mar 10 • 66 upvotes
Feature-Level Insights into Artificial Text Detection with Sparse Autoencoders • Paper • arXiv:2503.03601 • Published Mar 5 • 227 upvotes
HybridNorm: Towards Stable and Efficient Transformer Training via Hybrid Normalization • Paper • arXiv:2503.04598 • Published Mar 6 • 18 upvotes