title: Les Audits d'Affaires - Leaderboard
emoji: ⚖️
colorFrom: blue
colorTo: purple
sdk: gradio
sdk_version: 5.34.2
app_file: app.py
pinned: false
license: mit
short_description: French leaderboard for LLMs on business law
Les Audits d'Affaires - Leaderboard
A performance dashboard for LLMs on a French business law benchmark, with HuggingFace Datasets integration.
🚀 Setup Complete!
HuggingFace Datasets
- Requests: legmlai/laal-requests - tracks evaluation requests
- Results: legmlai/laal-results - stores evaluation results
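As a rough illustration of how the two repositories could be read, here is a minimal sketch using the `datasets` library. The helper names `dataset_page_url` and `load_rows` are hypothetical, and the `"train"` split name is an assumption about how the repositories are laid out; the project's own `dataset_manager.py` may work differently.

```python
import os

# Repository ids as given in this README.
REQUESTS_REPO = "legmlai/laal-requests"
RESULTS_REPO = "legmlai/laal-results"


def dataset_page_url(repo_id: str) -> str:
    """Return the Hub web page for a dataset repository."""
    return f"https://huggingface.co/datasets/{repo_id}"


def load_rows(repo_id: str, split: str = "train"):
    """Load one split of a Hub dataset as a list of dicts.

    Assumption: the data lives in a split named "train"; adjust this if the
    repositories use a different layout.
    """
    # Imported lazily so this module can be imported without `datasets` installed.
    from datasets import load_dataset

    # None lets the library fall back to its default credentials, if any.
    token = os.environ.get("HF_TOKEN")
    dataset = load_dataset(repo_id, split=split, token=token)
    return dataset.to_list()
```

Calling `load_rows(RESULTS_REPO)` needs network access and, if the repository is private, a valid `HF_TOKEN`.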
Current Models (with 0 scores)
- Qwen/Qwen3-14B (Alibaba)
- jpacifico/Chocolatine-2-14B-Instruct-v2.0.3 (jpacifico)
- meta-llama/Llama-3.1-8B-Instruct (Meta)
🏃♂️ Quick Start
Prerequisites
export HF_TOKEN=your_huggingface_token
Run Leaderboard
cd les-audites-affaires-leadboard
source venv/bin/activate
python app.py
The leaderboard will be available at: http://127.0.0.1:7860
📁 Project Structure
Core Files
- app.py - Main leaderboard application with HuggingFace integration
- dataset_manager.py - HuggingFace datasets management
- requirements.txt - Python dependencies
Setup Scripts
- create_datasets.py - Initialize HuggingFace datasets
- setup_initial_models.py - Add initial models with 0 scores
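To make "initial models with 0 scores" concrete, the sketch below builds placeholder result rows for the three models listed in this README, using the column names from the Results Dataset schema. It is not the contents of `setup_initial_models.py`: the function `zero_score_row`, the UUID id format, the empty `request_id`, and the `is_published` default are all assumptions.

```python
import uuid
from datetime import datetime, timezone

# Category score columns, taken from the Results Dataset schema in this README.
SCORE_FIELDS = [
    "score_action_requise",
    "score_delai_legal",
    "score_documents_obligatoires",
    "score_impact_financier",
    "score_consequences_non_conformite",
]

# Model names and providers as listed under "Current Models".
INITIAL_MODELS = [
    ("Qwen/Qwen3-14B", "Alibaba"),
    ("jpacifico/Chocolatine-2-14B-Instruct-v2.0.3", "jpacifico"),
    ("meta-llama/Llama-3.1-8B-Instruct", "Meta"),
]


def zero_score_row(model_name: str, model_provider: str) -> dict:
    """Build a placeholder result row with every score set to 0."""
    row = {
        "result_id": str(uuid.uuid4()),  # id format is an assumption
        "request_id": "",                # assumption: seeded rows have no request
        "model_name": model_name,
        "model_provider": model_provider,
        "overall_score": 0.0,
        "evaluation_timestamp": datetime.now(timezone.utc).isoformat(),
        "is_published": True,            # assumption: seeded rows are visible
    }
    row.update({field: 0.0 for field in SCORE_FIELDS})
    return row


initial_rows = [zero_score_row(name, provider) for name, provider in INITIAL_MODELS]
```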
✨ Features
Live Leaderboard
- Real-time data from HuggingFace datasets
- Automatic ranking and scoring
- Category-wise performance breakdown
- Interactive comparison charts
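One plausible way to implement the automatic ranking is to sort published rows by `overall_score` and assign ranks, with ties sharing a rank. The sketch below assumes that approach; `rank_models` is a hypothetical helper, and the tie-handling rule is an assumption rather than something this README specifies.

```python
def rank_models(rows: list[dict]) -> list[dict]:
    """Return published rows sorted by overall_score, each with a 1-based rank.

    Ties share the same rank (competition ranking: 1, 2, 2, 4).
    """
    # Rows without an is_published flag are treated as published (assumption).
    published = [row for row in rows if row.get("is_published", True)]
    ordered = sorted(published, key=lambda row: row["overall_score"], reverse=True)

    ranked = []
    for position, row in enumerate(ordered, start=1):
        if ranked and row["overall_score"] == ranked[-1]["overall_score"]:
            rank = ranked[-1]["rank"]  # tie: reuse the previous rank
        else:
            rank = position
        ranked.append({**row, "rank": rank})
    return ranked
```

The input rows are left unmodified; each returned row is a copy with an added `rank` key.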
Model Submissions
- Submit models directly through the UI
- Automatic request tracking
- Email notifications for updates
Pipeline Status Tracking
- 📊 Real-time status: Pending, Processing, Completed, Failed
- 📋 Recent evaluation requests table
- 🔄 Pipeline progress monitoring
- ⏳ Request status updates with emojis
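The status display could be driven by a small lookup table like the one sketched here. The four status names come from the list above; the specific emoji per status, the lowercase storage format, and the helper names `status_label` and `status_summary` are assumptions for illustration.

```python
from collections import Counter

# The four states named in this README; the emoji choices are assumptions.
STATUS_EMOJI = {
    "pending": "⏳",
    "processing": "🔄",
    "completed": "✅",
    "failed": "❌",
}


def status_label(status: str) -> str:
    """Format a status for display, e.g. 'pending' -> '⏳ Pending'."""
    key = status.strip().lower()
    emoji = STATUS_EMOJI.get(key, "❓")  # unknown statuses get a neutral marker
    return f"{emoji} {key.capitalize()}"


def status_summary(requests: list[dict]) -> dict:
    """Count requests per known status, including statuses with zero requests."""
    counts = Counter(r["request_status"].strip().lower() for r in requests)
    return {status: counts.get(status, 0) for status in STATUS_EMOJI}
```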
Data Management
- All data stored in HuggingFace datasets
- Refresh button for live updates
- Persistent across sessions
- Ready for automated status updates
Token Management
- Graceful handling when HF_TOKEN is not configured
- Clear instructions for Hugging Face Spaces deployment
- Read-only mode when the token is unavailable
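A minimal sketch of the fallback described above: check for `HF_TOKEN` at startup and switch to read-only mode when it is missing. The function name `resolve_access_mode`, the mode strings, and the message wording are hypothetical; `app.py` may handle this differently.

```python
import os


def resolve_access_mode(env=None) -> tuple[str, str]:
    """Return (mode, message) depending on whether HF_TOKEN is configured.

    `env` defaults to os.environ; passing a plain dict makes this easy to test.
    """
    env = os.environ if env is None else env
    token = (env.get("HF_TOKEN") or "").strip()
    if token:
        return "read-write", "HF_TOKEN found: model submissions are enabled."
    return (
        "read-only",
        "HF_TOKEN is not set: the leaderboard is read-only. "
        "On Hugging Face Spaces, add HF_TOKEN as a secret in the Space settings.",
    )
```

The returned message can be shown in the UI so visitors see why submissions are disabled instead of hitting an error.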
🔄 Next Steps
- Deploy to Hugging Face Spaces: Upload to HF Spaces with HF_TOKEN secret
- Connect Evaluation Harness: Use automation_example.py as integration guide
- Real Evaluations: Replace 0 scores with actual evaluation results
- Webhooks: Set up automatic updates from evaluation pipeline
- Enhanced Analytics: Add more detailed performance breakdowns
📊 Dataset Schema
Requests Dataset
- request_id, model_name, model_provider, request_type
- request_status, contact_email, request_timestamp
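For illustration, a new row matching these seven fields could be built as follows. The helper `new_request`, the `"evaluation"` request type, the `"pending"` initial status, the UUID id format, and the light input checks are assumptions, not the project's confirmed behavior.

```python
import uuid
from datetime import datetime, timezone

# Column names from the Requests Dataset schema in this README.
REQUEST_FIELDS = [
    "request_id",
    "model_name",
    "model_provider",
    "request_type",
    "request_status",
    "contact_email",
    "request_timestamp",
]


def new_request(model_name: str, model_provider: str, contact_email: str,
                request_type: str = "evaluation") -> dict:
    """Build a request row; the default values are illustrative assumptions."""
    if "/" not in model_name:
        raise ValueError("model_name should look like 'org/model'")
    if "@" not in contact_email:
        raise ValueError("contact_email does not look like an email address")
    return {
        "request_id": str(uuid.uuid4()),
        "model_name": model_name,
        "model_provider": model_provider,
        "request_type": request_type,
        "request_status": "pending",  # assumed initial state
        "contact_email": contact_email,
        "request_timestamp": datetime.now(timezone.utc).isoformat(),
    }
```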
Results Dataset
- result_id, request_id, model_name, model_provider
- overall_score, score_action_requise, score_delai_legal
- score_documents_obligatoires, score_impact_financier
- score_consequences_non_conformite, evaluation_timestamp, is_published
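Before publishing a result, it can help to check that a row carries every column listed above and to pull out the per-category scores for display. The sketch below shows one way to do that; `missing_fields` and `category_breakdown` are hypothetical helpers, and stripping the `score_` prefix to get a category name is an assumption about how categories are labeled.

```python
# Column names from the Results Dataset schema in this README.
RESULT_FIELDS = [
    "result_id",
    "request_id",
    "model_name",
    "model_provider",
    "overall_score",
    "score_action_requise",
    "score_delai_legal",
    "score_documents_obligatoires",
    "score_impact_financier",
    "score_consequences_non_conformite",
    "evaluation_timestamp",
    "is_published",
]


def missing_fields(row: dict) -> list[str]:
    """Return the schema columns that are absent from a result row."""
    return [field for field in RESULT_FIELDS if field not in row]


def category_breakdown(row: dict) -> dict:
    """Map each category (column name without 'score_') to its score."""
    prefix = "score_"
    return {
        name[len(prefix):]: row[name]
        for name in RESULT_FIELDS
        if name.startswith(prefix)
    }
```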
Created by: Mohamad Alhajar (legml.ai)
License: MIT