# CoRGI Custom Demo — Progress Log

Keep this log short and chronological. Newest updates at the top.

## 2024-10-22

- Reproduced the CoRGI pipeline failure with the real Qwen/Qwen3-VL-8B-Thinking checkpoint and traced it to reasoning outputs that number steps only with ordinal words (“First”, “Second”) rather than digits.
- Taught the text parser to normalize “First/Second step” style markers into numeric indices (see the normalization sketch after this list), refreshed the unit tests to cover the new heuristic, and reran the demo and end-to-end pipeline successfully.
- Tidied Qwen generation settings to avoid passing unused temperature flags when running deterministically.
- Validated ROI extraction on a vision-heavy prompt against the real model and hardened prompts so responses stay in structured JSON without verbose preambles.
- Added meta-comment pruning so thinking-mode rambles (e.g., redundant “Step 3” reflections) are dropped while genuine reasoning is preserved; confirmed with the official demo image that only meaningful steps remain.
- Authored a metadata-rich README.md (with Hugging Face Space front matter) so the deployed Space renders without configuration errors.
- Updated app.py to fall back to demo.queue() when the concurrency_count argument is unsupported (see the queue shim sketch below), fixing the runtime error seen on Spaces.
- Added ZeroGPU support: cached model/processor globals live on CUDA when available, a @spaces.GPU-decorated executor handles pipeline runs (see the loader sketch below), and requirements now include the spaces SDK.
- Introduced structured logging for the app (app.py) and pipeline execution to trace model loads, cache hits, and Gradio lifecycle events on Spaces (see the logging sketch below).
- Reworked the Gradio UI to show per-step panels with annotated evidence galleries, giving each CoRGI reasoning step its own window alongside the final synthesized answer.
- Preloaded the default Qwen3-VL model/tokenizer at import time so Spaces load the GPU weights before serving requests.
- Switched inference to bfloat16, tightened defaults (max steps/regions = 3), added per-stage timers (see the timer sketch below), and moved the @spaces.GPU decorator down to the raw _chat call so each generation stays within the 120 s ZeroGPU budget.
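
For reference, a minimal sketch of the ordinal-marker normalization, assuming a regex-based heuristic; the function name and exact pattern are illustrative, not the parser's actual code:

```python
import re

# Sketch of the heuristic described above; names and the exact regex are
# illustrative, not the parser's actual implementation.
_ORDINALS = {
    "first": 1, "second": 2, "third": 3, "fourth": 4,
    "fifth": 5, "sixth": 6, "seventh": 7, "eighth": 8,
}
_ORDINAL_RE = re.compile(
    r"^\s*(?:the\s+)?(first|second|third|fourth|fifth|sixth|seventh|eighth)"
    r"\s+step\b[:.,]?\s*",
    re.IGNORECASE,
)

def normalize_step_marker(line: str) -> str:
    """Rewrite ordinal markers ("First step: ...") as numeric ones ("Step 1: ...")."""
    match = _ORDINAL_RE.match(line)
    if match is None:
        return line
    index = _ORDINALS[match.group(1).lower()]
    return f"Step {index}: {line[match.end():]}"

assert normalize_step_marker("First step: locate the red car.") == "Step 1: locate the red car."
```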
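
The queue() fallback is a small compatibility shim. The sketch below assumes the Blocks app is named `demo` and relies on newer Gradio raising TypeError for the removed `concurrency_count` parameter:

```python
import gradio as gr

def queue_compat(demo: gr.Blocks, concurrency_count: int = 2) -> gr.Blocks:
    """Enable queuing on old and new Gradio releases alike."""
    try:
        # Gradio 3.x accepts concurrency_count.
        return demo.queue(concurrency_count=concurrency_count)
    except TypeError:
        # Gradio 4+ removed the parameter; fall back to the bare call.
        return demo.queue()
```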
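
The ZeroGPU wiring follows the standard `spaces` pattern: weights load into module-level globals at import, and generation runs inside a `@spaces.GPU`-decorated function. The auto class and function names below are assumptions, not the app's exact code:

```python
import spaces  # ZeroGPU SDK, now pinned in requirements
import torch
from transformers import AutoModelForImageTextToText, AutoProcessor

MODEL_ID = "Qwen/Qwen3-VL-8B-Thinking"

# Cached globals, created at import so the Space warms the weights before
# the first request. The auto class is an assumption for Qwen3-VL.
PROCESSOR = AutoProcessor.from_pretrained(MODEL_ID)
MODEL = AutoModelForImageTextToText.from_pretrained(MODEL_ID, torch_dtype=torch.bfloat16)
if torch.cuda.is_available():
    MODEL = MODEL.to("cuda")

@spaces.GPU(duration=120)  # keep each generation inside the ZeroGPU window
def run_pipeline(image, question: str):
    # Hypothetical entry point; the real app delegates to corgi.pipeline.
    ...
```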
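
A sketch of the logging setup, assuming stdlib `logging` with key=value style messages; logger names and call sites are illustrative:

```python
import logging

logger = logging.getLogger("corgi.app")  # illustrative logger name

def configure_logging() -> None:
    logging.basicConfig(
        level=logging.INFO,
        format="%(asctime)s %(name)s %(levelname)s %(message)s",
    )

# Typical call sites (messages illustrative):
# logger.info("model_load start checkpoint=%s", MODEL_ID)
# logger.info("model_load done cache_hit=%s", cache_hit)
```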
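
The per-stage timers can be expressed as a small context manager; names below are hypothetical:

```python
import logging
import time
from contextlib import contextmanager

logger = logging.getLogger("corgi.pipeline")

@contextmanager
def stage_timer(name: str):
    """Log wall-clock time for one pipeline stage (reasoning, ROI, synthesis)."""
    start = time.perf_counter()
    try:
        yield
    finally:
        logger.info("stage=%s elapsed=%.2fs", name, time.perf_counter() - start)

# Usage inside the pipeline:
# with stage_timer("roi_verification"):
#     regions = client.extract_rois(image, step)
```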

## 2024-10-21

- Updated default checkpoints to Qwen/Qwen3-VL-8B-Thinking and verified CLI, Gradio, and test coverage.
- Exercised the real model to capture thinking-style outputs; added parser fallbacks for textual reasoning/ROI responses and stripped <think> tags from answer synthesis (see the sketch after this list).
- Extended the unit test suite (reasoning, ROI, client helpers) to cover the new parsing paths and ran pytest successfully.
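
Stripping `<think>` spans reduces to a regex pass. A minimal sketch with a hypothetical helper name, which also drops an unterminated block left by a truncated generation:

```python
import re

_THINK_RE = re.compile(r"<think>.*?</think>\s*", re.DOTALL)

def strip_think(text: str) -> str:
    """Remove <think>...</think> spans emitted by thinking-mode checkpoints."""
    text = _THINK_RE.sub("", text)
    # A truncated generation can leave an unterminated block; drop it too.
    return re.sub(r"<think>.*", "", text, flags=re.DOTALL).strip()
```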

## 2024-10-20

- Added an optional integration test (corgi_tests/test_integration_qwen.py), gated by CORGI_RUN_QWEN_INTEGRATION, for running the real Qwen3-VL model on the official demo asset (see the gating sketch after this list).
- Created a runnable example script (examples/demo_qwen_corgi.py) to reproduce the Hugging Face demo prompt locally with structured pipeline logging.
- Published the Hugging Face Space harness (app.py) and deployment helper (scripts/push_space.sh), including requirements for the ZeroGPU tier.
- Documented cookbook alignment and inference tips (QWEN_INFERENCE_NOTES.md).
- Added a CLI runner (corgi.cli) with formatting helpers plus JSON export (see the CLI sketch after this list); authored matching unittest coverage.
- Implemented the Gradio demo harness (corgi.gradio_app) with markdown reporting and helper utilities for dependency injection.
- Expanded the unit test suite (CLI + Gradio) and ran pytest corgi_tests successfully (one skip when gradio is missing).
- Initialized the structured project plan and progress-log scaffolding.
- Assessed existing modules (corgi.pipeline, corgi.qwen_client, parsers, tests) to identify pending demo features (CLI + Gradio).
- Confirmed Qwen3-VL will be the single backbone for reasoning, ROI verification, and answer synthesis.
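
The gating uses the usual `pytest.mark.skipif` on an environment variable; the expected value (`"1"`) and test body below are assumptions:

```python
import os

import pytest

requires_qwen = pytest.mark.skipif(
    os.environ.get("CORGI_RUN_QWEN_INTEGRATION") != "1",
    reason="set CORGI_RUN_QWEN_INTEGRATION=1 to run the real-model test",
)

@requires_qwen
def test_pipeline_on_official_demo_asset():
    ...  # run the full pipeline and assert a non-empty synthesized answer
```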
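
The JSON export in `corgi.cli` plausibly reduces to serializing the pipeline result; the flag names, import path, and `to_dict()` helper below are hypothetical:

```python
import argparse
import json

from corgi.pipeline import run_pipeline  # hypothetical import path

def main() -> None:
    parser = argparse.ArgumentParser(prog="python -m corgi.cli")
    parser.add_argument("--image", required=True)
    parser.add_argument("--question", required=True)
    parser.add_argument("--json-out", help="optional path for the structured result")
    args = parser.parse_args()

    result = run_pipeline(args.image, args.question)
    if args.json_out:
        with open(args.json_out, "w", encoding="utf-8") as fh:
            json.dump(result.to_dict(), fh, indent=2)  # to_dict() is assumed
```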