Hugging Face
xiaoqijian
mx1024
0 followers · 2 following
AI & ML interests
None yet
Recent Activity
authored a paper 25 days ago
Stress Testing Generalization: How Minor Modifications Undermine Large Language Model Performance
authored a paper 25 days ago
Evaluation is All You Need: Strategic Overclaiming of LLM Reasoning Capabilities Through Evaluation Design
upvoted a paper 25 days ago
Evaluation is All You Need: Strategic Overclaiming of LLM Reasoning Capabilities Through Evaluation Design
Organizations
Papers
2
arxiv:2506.04734
arxiv:2502.12459
models
0
None public yet
datasets
0
None public yet