-
223
MMLU-Pro Leaderboard
More advanced and challenging multi-task evaluation
-
51
Stick To Your Role! Leaderboard
Benchmarking LLMs on the stability of simulated populations
-
53
ZeroEval Leaderboard
Embed and use ZeroEval for evaluation tasks
-
26
Decentralized Arena Leaderboard
Display model leaderboard evaluations
Hristo Panev
hppdqdq
AI & ML interests
None yet
Recent Activity
liked a Space 3 days ago: zh-ai-community/model-release-heatmap-zh
liked a Space 5 days ago: multimodalart/wan-2-2-first-last-frame
liked a model 9 days ago: Phr00t/Chroma-Rapid-AIO
Organizations
None yet