Demo leaderboard with an integrated backend

What is this?

This repository is a demo leaderboard template. You can copy the leaderboard space and the two datasets (results and requests) to your org to get started with your own leaderboard!

The space does three things:

- stores user submissions and sends them to the requests dataset;
- reads the submissions depending on their status and creation date, and launches evaluations through the main_backend.py file, using the EleutherAI LM Evaluation Harness. The results of these evaluations are then sent to the results dataset;
- reads the results and displays them in a leaderboard.

You can also move the backend to its own space if needed: take main_backend.py and put it in a separate space, with an app.py that runs it every few minutes. This is probably the best solution.
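As a rough illustration of the separate-backend option, here is a minimal sketch of an app.py that re-runs the backend script on a fixed interval. The five-minute interval, the run_periodically helper, and the assumption that main_backend.py sits next to app.py and can be launched as a standalone script are all illustrative and not taken from the template.

```python
import subprocess
import sys
import time

# Assumption: main_backend.py can be run as a standalone script.
BACKEND_COMMAND = [sys.executable, "main_backend.py"]
INTERVAL_SECONDS = 5 * 60  # hypothetical value for "every few minutes"


def run_periodically(command, interval_seconds, max_runs=None,
                     runner=subprocess.run, sleep=time.sleep):
    """Run `command` repeatedly, pausing `interval_seconds` between runs.

    `max_runs=None` loops forever. `runner` and `sleep` can be swapped
    out for testing. Returns the number of runs completed.
    """
    runs = 0
    while max_runs is None or runs < max_runs:
        # check=False: one failed evaluation pass should not stop the loop.
        runner(command, check=False)
        runs += 1
        if max_runs is None or runs < max_runs:
            sleep(interval_seconds)
    return runs
```

Calling run_periodically(BACKEND_COMMAND, INTERVAL_SECONDS) at the bottom of app.py would then start the loop. Running the backend as a subprocess keeps each evaluation pass isolated, so a crash in one pass does not take down the scheduler.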

Getting started

Defining environment variables

To get started on your own leaderboard, you will need to edit two files:

- src/envs.py, to define your own environment variables (such as the name of the org this has been copied to)
- src/about.py, to define your tasks and the number of few_shots you want for them
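For orientation, the sketch below shows roughly what these two edits might look like. Every name and value in it (OWNER, the LEADERBOARD_OWNER variable, the repository IDs, the Task fields, the two example tasks, NUM_FEWSHOT) is an assumption made for illustration. Check them against the actual files in your copy of the template.

```python
import os
from dataclasses import dataclass
from enum import Enum

# --- src/envs.py (sketch) ---
# Hypothetical: the org the space and the two datasets were copied to.
OWNER = os.environ.get("LEADERBOARD_OWNER", "my-org")
REPO_ID = f"{OWNER}/leaderboard"
REQUESTS_REPO = f"{OWNER}/requests"
RESULTS_REPO = f"{OWNER}/results"


# --- src/about.py (sketch) ---
@dataclass(frozen=True)
class Task:
    benchmark: str  # task key as it appears in the results file
    metric: str     # metric key as it appears in the results file
    col_name: str   # column title shown in the leaderboard


class Tasks(Enum):
    # Example entries only; replace them with your own tasks.
    task0 = Task("task_name1", "metric_name", "Task 1")
    task1 = Task("task_name2", "metric_name", "Task 2")


NUM_FEWSHOT = 0  # number of few-shot examples used for the tasks
```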

Setting up fake results to initialize the leaderboard

Once this is done, you need to edit the "fake results" file to fit the format of your tasks: in the results sub-dictionary, replace task_name1 and metric_name with the values you defined in Tasks above.

    "results": {
        "task_name1": {
            "metric_name": 0
        }
    }
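If you have several tasks, the results sub-dictionary can be generated instead of typed by hand. The sketch below is one possible way to do it. The build_fake_results helper and the example task and metric names are hypothetical, and it only covers the results sub-dictionary shown above.

```python
import json


def build_fake_results(task_metrics, placeholder=0):
    """Return a {"results": {...}} dict with one placeholder score per metric.

    `task_metrics` is a list of (task_name, metric_name) pairs.
    """
    results = {}
    for task_name, metric_name in task_metrics:
        results.setdefault(task_name, {})[metric_name] = placeholder
    return {"results": results}


# Hypothetical task/metric names; use the ones from your Tasks definition.
fake = build_fake_results([("task_name1", "metric_name"),
                           ("task_name2", "metric_name")])
print(json.dumps(fake, indent=4))
```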
