GAIA: a benchmark for General AI Assistants
Paper: arXiv 2311.12983
This collection gathers the items of the GAIA release.
Note: The arXiv paper (arxiv.org/abs/2311.12983) describing the benchmark and the dataset creation methodology.
Submit models for evaluation and view leaderboard scores
Note: The leaderboard itself, with the scored models and instructions for submitting a new model.
Note: The dataset of questions for the GAIA benchmark.
Note: An open dataset of submission results.