The tasks and counterfactuals from the Mechanistic Interpretability Benchmark.
AI & ML interests
Principled evaluation of mechanistic interpretability methods.
Datasets (7)
mib-bench/ravel • 117k • 25
mib-bench/arithmetic_subtraction • 20.9k • 49
mib-bench/arithmetic_addition • 40.4k • 145
mib-bench/ioi • 21k • 2.82k
mib-bench/arc_easy • 4.01k • 166
mib-bench/arc_challenge • 2k • 22
mib-bench/copycolors_mcqa • 1.89k • 1.42k
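
The repositories listed above can be fetched programmatically. The following is a minimal sketch, not an official MIB loader: `MIB_DATASETS`, `hub_url`, and `load_mib_dataset` are hypothetical names introduced here for illustration, and it assumes the Hugging Face `datasets` library is installed and that each repository loads with default settings. A repository that defines several configurations would need `name=...` passed through.

```python
# Repository IDs copied from the listing above.
MIB_DATASETS = [
    "mib-bench/ravel",
    "mib-bench/arithmetic_subtraction",
    "mib-bench/arithmetic_addition",
    "mib-bench/ioi",
    "mib-bench/arc_easy",
    "mib-bench/arc_challenge",
    "mib-bench/copycolors_mcqa",
]


def hub_url(repo_id: str) -> str:
    """Return the Hugging Face Hub page URL for a dataset repository."""
    return f"https://huggingface.co/datasets/{repo_id}"


def load_mib_dataset(repo_id: str, **kwargs):
    """Load one listed dataset with the `datasets` library.

    Assumption: default settings are enough. Extra keyword arguments
    (for example `name` or `split`) are forwarded to `load_dataset`.
    """
    if repo_id not in MIB_DATASETS:
        raise ValueError(f"unknown MIB dataset: {repo_id}")
    # Imported lazily so the helpers above work without the dependency.
    from datasets import load_dataset  # requires `pip install datasets`

    return load_dataset(repo_id, **kwargs)


if __name__ == "__main__":
    # Print the Hub page for every listed repository (no network access needed).
    for repo_id in MIB_DATASETS:
        print(hub_url(repo_id))
```

Calling `load_mib_dataset("mib-bench/ioi")` would download that repository on first use; the available splits and column names are not stated in this listing, so inspect the returned object before relying on them.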