Space: allenai/asta-bench-leaderboard
Duplicated from: allenai/asta-bench-internal-leaderboard
Repository: asta-bench-leaderboard (8.88 MB, 11 contributors)
History: 101 commits
Latest commit: a0bf2a2 (unverified), "Add diagram take 2 (#110)" by Amber Tanaka, 24 days ago
Name | Scan | Size | Last commit | Updated
.github | - | - | Jason/inttest and contact record improvements for reviewer (#97) | 24 days ago
assets | - | - | Add diagram take 2 (#110) | 24 days ago
data | - | - | Asta Leaderboard First Draft (#3) | 3 months ago
tests | - | - | Jason/inttest and contact record improvements for reviewer (#97) | 24 days ago
.gitattributes | Safe | 77 Bytes | Jason/inttest and contact record improvements for reviewer (#97) | 24 days ago
.gitignore | Safe | 3.5 kB | Remove claude preferences from codebase (#68) | about 1 month ago
Dockerfile | Safe | 1.79 kB | Leaderboard (#2) | 5 months ago
README.md | Safe | 2.03 kB | Instructions around pushing to the second leaderboard (#93) | 25 days ago
about.py | Safe | 7.19 kB | update styling of links (#77) | 28 days ago
aliases.py | Safe | 930 Bytes | Get openness and tool usage names from the same place (#90) | 25 days ago
app.py | Safe | 9.42 kB | Jason/inttest and contact record improvements for reviewer (#97) | 24 days ago
c_and_e.py | Safe | 278 Bytes | more eval ordering changes (#43) | about 1 month ago
category_page_builder.py | Safe | 5.13 kB | Add diagram take 2 (#110) | 24 days ago
config.py | Safe | 989 Bytes | Jason/submit only to submissions repo (#65) | about 1 month ago
content.py | Safe | 29.9 kB | Add diagram take 2 (#110) | 24 days ago
data_analysis.py | Safe | 273 Bytes | Nav bar updates (#18) | about 2 months ago
e2e.py | Safe | 271 Bytes | more eval ordering changes (#43) | about 1 month ago
leaderboard_transformer.py | Safe | 26.8 kB | Change name of LLM Base and adjust hover behavior (#85) | 28 days ago
literature_understanding.py | Safe | 245 Bytes | Nav bar updates (#18) | about 2 months ago
main_page.py | Safe | 3.32 kB | Add diagram take 2 (#110) | 24 days ago
requirements-dev.txt | Safe | 45 Bytes | Jason/inttest and contact record improvements for reviewer (#97) | 24 days ago
requirements.txt | Safe | 2.29 kB | bump agent-eval version to pick up reasoning effort model name display thing (#88) | 28 days ago
submission.py | Safe | 22.2 kB | Jason/inttest and contact record improvements for reviewer (#97) | 24 days ago
ui_components.py | Safe | 39.9 kB | Add diagram take 2 (#110) | 24 days ago