MixEval
community
https://mixeval.github.io/
NiJinjie
Psycoy
Followers: 12
AI & ML interests
LLM & LMM evaluation
Recent Activity
Solaris99 authored a paper 11 days ago: VisualWebBench: How Far Have Multimodal LLMs Evolved in Web Page Understanding and Grounding?
Solaris99 authored a paper 11 days ago: The Good, The Bad, and The Greedy: Evaluation of LLMs Should Not Ignore Non-Determinism
Solaris99 authored a paper 11 days ago: AgentBank: Towards Generalized LLM Agents via Fine-Tuning on 50000+ Interaction Trajectories
Team members: 7
Models: 0 (none public yet)
Datasets: 2
MixEval/MixEval-X • Updated Feb 15 • 7.68k • 106 • 10
MixEval/MixEval • Updated Sep 27, 2024 • 5k • 99 • 22