- Open VLM Leaderboard (926): VLMEvalKit Evaluation Results Collection
- Open VLM Video Leaderboard (128): VLMEvalKit evaluation results on video understanding benchmarks
- Open LMM Reasoning Leaderboard (43): A leaderboard that demonstrates LMM reasoning capabilities
- MMBench Leaderboard (24): Explore MMBench Leaderboard data
Organization Card
Join us on Discord and WeChat
Follow us on GitHub
OpenCompass is a platform focused on the evaluation of AGI, including large language models and multi-modality models. We aim to:
- develop high-quality libraries that reduce the difficulty of evaluation
- provide convincing leaderboards that improve the understanding of large models
- create powerful toolchains targeting a variety of abilities and tasks
- build solid benchmarks to support large model research
CompassVerifier: A Unified and Robust Verifier for LLMs Evaluation and Outcome Reward
spaces
17
- RISEBench Gallery (pinned, running, 27): A Gallery of Generation Results on RISEBench
- Open LMM Spatial Leaderboard (pinned, running, 4): A Leaderboard for LMM spatial understanding capabilities
- Open LMM Subjective Leaderboard (pinned, running, 26): VLMEvalKit Subjective Benchmark Results
- CompassAcademic Leaderboard Full Version (pinned, running, 3): Compass Academic Leaderboard Full Version
- Open LMM Reasoning Leaderboard (pinned, running, 43): A Leaderboard that demonstrates LMM reasoning capabilities
- Compass Academic Leaderboard (pinned, running, 6): Compass Academic Leaderboard
models
13
- opencompass/CompassJudger-2-7B-Instruct • Text Ranking • 8B • 48 • 2
- opencompass/CompassJudger-2-32B-Instruct • Text Ranking • 33B • 54 • 2
- opencompass/CompassVerifier-32B • 33B • 28 • 6
- opencompass/CompassVerifier-7B • 8B • 782 • 4
- opencompass/CompassVerifier-3B • 3B • 800 • 4
- opencompass/anah-7b • Text Classification • 8B
- opencompass/anah-20b • Text Classification • 20B • 1
- opencompass/anah-v2 • Text Classification • 8B • 53 • 4
- opencompass/CompassJudger-1-14B-Instruct • Text Generation • 15B • 3 • 2
- opencompass/CompassJudger-1-32B-Instruct • Text Generation • 33B • 4 • 17
datasets
15
- opencompass/NeedleBench • Viewer • 6.8k • 14.3k • 5
- opencompass/ReasonZoo • 28
- opencompass/VerifierBench • Viewer • 2.82k • 108 • 2
- opencompass/LiveMathBench • Viewer • 483 • 1.67k • 10
- opencompass/CodeForce_SAGA • Viewer • 5.57k • 105 • 1
- opencompass/CodeCompass • 225 • 1
- opencompass/compass_academic_predictions • Viewer • 4.42M • 14
- opencompass/Creation-MMBench • Viewer • 765 • 158 • 3
- opencompass/anah • Viewer • 783 • 51 • 3
- opencompass/AIME2025 • Viewer • 30 • 7.24k • 37