Mechanistic Interpretability Benchmark
university
https://mib-bench.github.io
Followers: 25
AI & ML interests
Principled evaluation of mechanistic interpretability methods.
Recent Activity
amueller updated a Space 6 days ago: mib-bench/leaderboard
hij authored a paper about 2 months ago: AxBench: Steering LLMs? Even Simple Baselines Outperform Sparse Autoencoders
hij authored a paper about 2 months ago: LLMs Encode Harmfulness and Refusal Separately
Team members: 19
mib-bench's models: 3
mib-bench/mib-circuits-example (updated Jul 23)
mib-bench/mib-causalvariable-example (updated May 29)
mib-bench/interpbench (updated May 17)