
# Quick Start Guide

## Installation

1. Install dependencies with uv:

   ```bash
   uv sync
   ```

2. Install Playwright browsers:

   ```bash
   uv run playwright install chromium
   ```

3. Set up environment variables:

   ```bash
   cp .env.example .env
   ```

Edit `.env` and configure at minimum:

- `STUDENT_SECRET` - Your secret key
- `STUDENT_EMAIL` - Your email
- `GITHUB_TOKEN` - GitHub personal access token
- `GITHUB_USERNAME` - Your GitHub username
- `ANTHROPIC_API_KEY` or `OPENAI_API_KEY` - LLM API key
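
A minimal `.env` sketch; every value below is a placeholder, and the key formats (`ghp_…`, `sk-ant-…`) are only illustrative:

```bash
STUDENT_SECRET=change-me
[email protected]
GITHUB_TOKEN=ghp_xxxxxxxxxxxxxxxx
GITHUB_USERNAME=your-github-username
# Provide one of the two LLM keys:
ANTHROPIC_API_KEY=sk-ant-xxxxxxxx
# OPENAI_API_KEY=sk-xxxxxxxx
```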

## Running the System

### Option 1: Using the `main.py` CLI

Start Student API:

```bash
uv run python main.py student-api
```

Start Instructor API:

```bash
uv run python main.py instructor-api
```

Initialize Database:

```bash
uv run python main.py init-db
```

Run Round 1:

```bash
uv run python main.py round1
```

Run Evaluation:

```bash
uv run python main.py evaluate
```

Run Round 2:

```bash
uv run python main.py round2
```
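
Taken together, a typical end-to-end run looks like the following sketch. It assumes the ordering used in the testing walkthrough below: both APIs stay running in their own terminals, and the database is initialized before the rounds.

```bash
# Terminal 1: student side
uv run python main.py student-api

# Terminal 2: instructor side
uv run python main.py instructor-api

# Terminal 3: orchestration
uv run python main.py init-db
uv run python main.py round1
uv run python main.py evaluate
uv run python main.py round2
```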

### Option 2: Direct module execution

Start Student API:

```bash
uv run python -m student.api
```

Start Instructor API:

```bash
uv run python -m instructor.api
```

## Testing the System

### 1. Test Student API

Start the student API:

```bash
uv run python main.py student-api
```

In another terminal, send a test request:

```bash
curl -X POST http://localhost:8000/api/build \
  -H "Content-Type: application/json" \
  -d '{
    "email": "[email protected]",
    "secret": "your-secret",
    "task": "test-task-abc",
    "round": 1,
    "nonce": "unique-nonce-123",
    "brief": "Create a simple Hello World page with Bootstrap",
    "checks": ["Page displays Hello World", "Bootstrap is loaded"],
    "evaluation_url": "http://localhost:8001/api/evaluate",
    "attachments": []
  }'
```
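
The same request can also be sent from Python using only the standard library; this mirrors the curl payload above, so the email, secret, and URLs are the same placeholders:

```python
import json
import urllib.request

payload = {
    "email": "[email protected]",
    "secret": "your-secret",
    "task": "test-task-abc",
    "round": 1,
    "nonce": "unique-nonce-123",
    "brief": "Create a simple Hello World page with Bootstrap",
    "checks": ["Page displays Hello World", "Bootstrap is loaded"],
    "evaluation_url": "http://localhost:8001/api/evaluate",
    "attachments": [],
}

# POST the JSON body to the student build endpoint and print the raw reply.
req = urllib.request.Request(
    "http://localhost:8000/api/build",
    data=json.dumps(payload).encode(),
    headers={"Content-Type": "application/json"},
)
with urllib.request.urlopen(req) as resp:
    print(resp.status, resp.read().decode())
```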

Check status:

```bash
curl http://localhost:8000/api/status/test-task-abc
```
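
Code generation can take a while, so polling the status endpoint from a script may be more convenient. A small sketch; the completion markers it looks for are assumptions about the response schema, so adjust them to the actual API:

```python
import time
import urllib.request

STATUS_URL = "http://localhost:8000/api/status/test-task-abc"

for attempt in range(30):  # poll for up to ~5 minutes
    with urllib.request.urlopen(STATUS_URL) as resp:
        body = resp.read().decode()
    print(f"[{attempt}] {body}")
    # Stop once the status looks final. These literal strings are an
    # assumption about the response schema, not a documented contract.
    if '"completed"' in body or '"failed"' in body:
        break
    time.sleep(10)
```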

### 2. Test Instructor Workflow

Start Instructor API:

```bash
uv run python main.py instructor-api
```

Initialize Database:

```bash
uv run python main.py init-db
```

Create `submissions.csv`:

```csv
timestamp,email,endpoint,secret
2025-01-15T10:00:00,[email protected],http://localhost:8000/api/build,your-secret
```
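
If you are assembling `submissions.csv` for several students, it can also be generated with the standard library's `csv` module; the row values below are the same placeholders as in the example above:

```python
import csv
from datetime import datetime, timezone

# One dict per student submission, matching the CSV columns above.
rows = [
    {
        "timestamp": datetime.now(timezone.utc).strftime("%Y-%m-%dT%H:%M:%S"),
        "email": "[email protected]",
        "endpoint": "http://localhost:8000/api/build",
        "secret": "your-secret",
    }
]

with open("submissions.csv", "w", newline="") as f:
    writer = csv.DictWriter(f, fieldnames=["timestamp", "email", "endpoint", "secret"])
    writer.writeheader()
    writer.writerows(rows)
```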

Run Round 1:

```bash
uv run python main.py round1
```

Run Evaluation:

```bash
uv run python main.py evaluate
```

Check Results:

```bash
curl http://localhost:8001/api/results/[email protected]
```

## Project Structure Overview

```text
student/              # Student side (receives tasks, generates code)
├── api.py            # API endpoint
├── code_generator.py # LLM code generation
├── github_manager.py # GitHub operations
└── notification_client.py # Notify evaluation

instructor/           # Instructor side (generates tasks, evaluates)
├── api.py            # Evaluation endpoint
├── database.py       # Database operations
├── task_templates.py # Template management
├── round1.py         # Generate round 1 tasks
├── round2.py         # Generate round 2 tasks
├── evaluate.py       # Run evaluations
└── checks/           # Evaluation checks
    ├── static_checks.py
    ├── dynamic_checks.py
    └── llm_checks.py

shared/               # Shared utilities
├── config.py         # Configuration
├── models.py         # Data models
├── logger.py         # Logging
└── utils.py          # Utilities

templates/            # Task templates (YAML)
```

## Next Steps

1. Configure your `.env` file with actual credentials
2. Set up a PostgreSQL database (if using instructor features)
3. Review the task templates in `templates/`
4. Test the student API with a simple request
5. Set up the instructor system for evaluation

## Common Issues

**Import errors:** Make sure you run commands with the `uv run` prefix.

**GitHub auth errors:** Verify that `GITHUB_TOKEN` in `.env` has the proper permissions.

**Database errors:** Make sure PostgreSQL is running and `DATABASE_URL` is correct. If in doubt, compare your connection string against the standard PostgreSQL URL format shown below.
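
A standard PostgreSQL connection string; the credentials and database name here are placeholders:

```bash
DATABASE_URL=postgresql://user:password@localhost:5432/your_database
```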

**LLM errors:** Check your API key and quota.

For more details, see `README.md`.