Perspectives for Direct Interpretability in Multi-Agent Deep Reinforcement Learning Paper • 2502.00726 • Published Feb 2 • 1
Contrastive Sparse Autoencoders for Interpreting Planning of Chess-Playing Agents Paper • 2406.04028 • Published Jun 6, 2024 • 2