I Have Covered All the Bases Here: Interpreting Reasoning Features in Large Language Models via Sparse Autoencoders Paper • 2503.18878 • Published about 20 hours ago • 70
mrsndmn/audiotokenizer_emilia_multilang-wav-qwen2.5-3b_salt Text Generation • Updated about 22 hours ago • 4
When Less is Enough: Adaptive Token Reduction for Efficient Image Representation Paper • 2503.16660 • Published 5 days ago • 61
Cramming 1568 Tokens into a Single Vector and Back Again: Exploring the Limits of Embedding Space Capacity Paper • 2502.13063 • Published Feb 18 • 67
Unraveling the Complexity of Memory in RL Agents: an Approach for Classification and Evaluation Paper • 2412.06531 • Published Dec 9, 2024 • 71
nGPT: Normalized Transformer with Representation Learning on the Hypersphere Paper • 2410.01131 • Published Oct 1, 2024 • 10