Hugging Face
Alex Makelov
amakelov
https://amakelov.github.io
AMakelov
AI & ML interests
Interpretability
Recent Activity
authored a paper about 2 months ago
Towards Deep Learning Models Resistant to Adversarial Attacks
authored a paper 6 months ago
Is This the Subspace You Are Looking for? An Interpretability Illusion for Subspace Activation Patching
authored a paper 6 months ago
Towards Principled Evaluations of Sparse Autoencoders for Interpretability and Control
Organizations
None yet
Papers (3)
arxiv:2405.08366
arxiv:2311.17030
arxiv:1706.06083
models
None public yet
datasets
None public yet