ELI-Why Collection 🧠 ELI-Why: Evaluating the Pedagogical Utility of Language Model Explanations ACL Findings 2025 • 4 items • Updated 16 days ago • 3
Article Enhance Your Models in 5 Minutes with the Hugging Face Kernel Hub By drbh and 6 others • 15 days ago • 100
Decomposing MLP Activations into Interpretable Features via Semi-Nonnegative Matrix Factorization Paper • 2506.10920 • Published 15 days ago • 6
From Flat to Hierarchical: Extracting Sparse Representations with Matching Pursuit Paper • 2506.03093 • Published 24 days ago • 2
🧠 Reasoning datasets Collection Datasets with reasoning traces for math and code released by the community • 24 items • Updated May 19 • 152
Beyond the 80/20 Rule: High-Entropy Minority Tokens Drive Effective Reinforcement Learning for LLM Reasoning Paper • 2506.01939 • Published 25 days ago • 162
Article The Transformers Library: standardizing model definitions By lysandre and 3 others • May 15 • 114
Article *Context Is Gold to Find the Gold Passage*: Evaluating and Training Contextual Document Embeddings By manu and 1 other • 25 days ago • 24
FAMA Collection The First Large-Scale Open-Science Speech Foundation Model for English and Italian • 5 items • Updated 28 days ago • 7
Unsupervised Word-level Quality Estimation for Machine Translation Through the Lens of Annotators (Dis)agreement Paper • 2505.23183 • Published 29 days ago • 2
SAEs Are Good for Steering -- If You Select the Right Features Paper • 2505.20063 • Published May 26 • 1
Mechanistic evaluation of Transformers and state space models Paper • 2505.15105 • Published May 21 • 1
Improved Representation Steering for Language Models Paper • 2505.20809 • Published about 1 month ago • 1
Steering Large Language Models for Machine Translation Personalization Paper • 2505.16612 • Published May 22 • 6
Contrastive Explanations That Anticipate Human Misconceptions Can Improve Human Decision-Making Skills Paper • 2410.04253 • Published Oct 5, 2024 • 1
MICE for CATs: Model-Internal Confidence Estimation for Calibrating Agents with Tools Paper • 2504.20168 • Published Apr 28 • 1
70% Size, 100% Accuracy: Lossless LLM Compression for Efficient GPU Inference via Dynamic-Length Float Paper • 2504.11651 • Published Apr 15 • 28
MIB Datasets Collection The tasks and counterfactuals from the Mechanistic Interpretability Benchmark. • 7 items • Updated Apr 16 • 3
NNsight and NDIF: Democratizing Access to Foundation Model Internals Paper • 2407.14561 • Published Jul 18, 2024 • 36