Chatbot Arena Leaderboard
Display chatbot performance leaderboard
Track, rank and evaluate open LLMs and chatbots
Embedding Leaderboard
Request evaluation for new speech models
Explore LLM performance across hardware
Submit code models for evaluation on benchmarks
Can AI Code? An LLM leaderboard including quantized models.
View and submit LLM evaluations
View and submit machine learning model evaluations
Analyze images to detect and label objects
Evaluate LLM cybersecurity risks
View LLM Performance Leaderboard
Explore benchmark results for QA and long doc models
VLMEvalKit Evaluation Results Collection
Explore and analyze RewardBench leaderboard data
Explore and analyze code evaluation data
Display and filter multimodal model leaderboard results
Display a machine translation evaluation interface
Visualize Open vs. Proprietary LLM Progress
Vote on AI responses to rank models
Blind vote on HF TTS models!
A leaderboard for LLMs powering smolagents