metadata
title: CoRGI Qwen3-VL Demo
emoji: 🐶
colorFrom: indigo
colorTo: pink
sdk: gradio
sdk_version: 5.41.1
app_file: app.py
pinned: false
license: apache-2.0
CoRGI Qwen3-VL Demo
This Space showcases the CoRGI reasoning pipeline powered entirely by Qwen/Qwen3-VL-2B-Instruct.
Upload an image, ask a visual question, and the app will:
- Generate structured reasoning steps with visual-verification flags.
- Request region-of-interest evidence for steps that require vision.
- Synthesize a grounded final answer.
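The three stages above can be sketched as a small orchestration function. This is only an illustration of the control flow, not the Space's actual code: the names `ReasoningStep`, `Evidence`, `run_pipeline`, and the three callables are hypothetical, and the bounding-box format is an assumption.

```python
from dataclasses import dataclass
from typing import Callable, List, Tuple

# NOTE: every name in this sketch is hypothetical and chosen for illustration.


@dataclass
class ReasoningStep:
    index: int
    statement: str
    needs_vision: bool  # the visual-verification flag


@dataclass
class Evidence:
    step_index: int
    bbox: Tuple[float, float, float, float]  # assumed (x1, y1, x2, y2)
    description: str
    confidence: float


def run_pipeline(
    question: str,
    reason: Callable[[str, int], List[ReasoningStep]],
    ground: Callable[[ReasoningStep, int], List[Evidence]],
    synthesize: Callable[[str, List[ReasoningStep], List[Evidence]], str],
    max_steps: int = 3,
    max_regions: int = 3,
) -> Tuple[List[ReasoningStep], List[Evidence], str]:
    # Stage 1: structured reasoning steps, capped at max_steps.
    steps = reason(question, max_steps)[:max_steps]

    # Stage 2: request ROI evidence only for steps flagged as needing vision.
    evidence: List[Evidence] = []
    for step in steps:
        if step.needs_vision:
            evidence.extend(ground(step, max_regions)[:max_regions])

    # Stage 3: synthesize the final answer from the steps and the evidence.
    answer = synthesize(question, steps, evidence)
    return steps, evidence, answer
```

In a real run, each callable would wrap a call to the vision-language model; plain functions are used here so the flow can be exercised without loading a checkpoint.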
Running Locally
pip install -r requirements.txt
python examples/demo_qwen_corgi.py \
--model-id Qwen/Qwen3-VL-2B-Instruct \
--max-steps 3 \
--max-regions 3
To launch the Gradio demo locally:
python app.py
📚 Full Documentation
See the docs/ folder for complete documentation:
- 🚀 Quick Start - Begin here!
- 📖 Usage Guide - How to use
- 🔧 Deployment - Deploy to HF Spaces
- 📊 Summary Report - Full overview
Configuration Notes
- Model: Uses Qwen/Qwen3-VL-2B-Instruct (2B parameters, ~5GB VRAM)
- Single GPU: The model loads on a single GPU (cuda:0) to avoid memory fragmentation
- Hardware: The Space runs on the cpu-basic tier by default
- Customization: Set the CORGI_QWEN_MODEL environment variable to use a different checkpoint
- Sliders: max_steps and max_regions control reasoning depth and ROI candidates
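As a rough sketch of how these settings could be resolved at startup: the `CORGI_QWEN_MODEL` variable, the default checkpoint, and the `cuda:0` device come from the notes above, while the helper names `resolve_model_id` and `pick_device` are hypothetical and the CPU fallback is an assumption. How app.py actually reads its configuration is not shown here.

```python
import os
from typing import Mapping, Optional

# Default checkpoint named in the configuration notes.
DEFAULT_MODEL_ID = "Qwen/Qwen3-VL-2B-Instruct"


def resolve_model_id(env: Optional[Mapping[str, str]] = None) -> str:
    """Return the checkpoint from CORGI_QWEN_MODEL, or the default if unset or blank."""
    env = os.environ if env is None else env
    value = env.get("CORGI_QWEN_MODEL", "").strip()
    return value or DEFAULT_MODEL_ID


def pick_device(cuda_available: bool) -> str:
    """Pin the model to one GPU when available (assumed fallback: CPU)."""
    return "cuda:0" if cuda_available else "cpu"
```

For example, launching with `CORGI_QWEN_MODEL=<other checkpoint> python app.py` would make `resolve_model_id()` return that checkpoint under this sketch; in practice `pick_device` would be fed something like `torch.cuda.is_available()`.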
UI Overview
- Chain of Thought: Displays the structured reasoning steps with vision flags, alongside the exact prompt/response sent to the model.
- ROI Extraction: Shows the source image with every grounded bounding box plus per-evidence crops, and lists the prompts used for each verification step.
- Evidence Descriptions: Summarises each grounded region (bbox, description, confidence) with the associated ROI prompts.
- Answer Synthesis: Highlights the final answer, supporting context, and the synthesis prompt/response pair.
- Performance: Reports per-stage timings (reasoning, ROI extraction, synthesis) plus overall latency so you can monitor ZeroGPU runtime limits.
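The per-stage timings in the Performance panel could be collected with a small context manager like the one below. This is a minimal sketch, not the Space's implementation: `StageTimer` and the stage names used in the example are hypothetical.

```python
import time
from contextlib import contextmanager
from typing import Dict, Iterator


class StageTimer:
    """Accumulates wall-clock seconds per named stage (hypothetical helper)."""

    def __init__(self) -> None:
        self.timings: Dict[str, float] = {}

    @contextmanager
    def stage(self, name: str) -> Iterator[None]:
        start = time.perf_counter()
        try:
            yield
        finally:
            # Record elapsed time even if the stage raises.
            elapsed = time.perf_counter() - start
            self.timings[name] = self.timings.get(name, 0.0) + elapsed

    def total(self) -> float:
        """Overall latency as the sum of all recorded stages."""
        return sum(self.timings.values())


timer = StageTimer()
with timer.stage("reasoning"):
    pass  # call the reasoning stage here
with timer.stage("roi_extraction"):
    pass  # call the ROI extraction stage here
with timer.stage("synthesis"):
    pass  # call the answer synthesis stage here
```

Summing the stages gives an overall figure to compare against whatever runtime limit the hardware tier imposes.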