Spaces:

tuandunghcmut
/

corgi-qwen3-vl-demo

Runtime error

corgi-qwen3-vl-demo / PROGRESS_LOG.md

dung-vpt-uney

Deploy latest CoRGI Gradio demo

b6a01d6 17 days ago

2.7 kB

CoRGI Custom Demo — Progress Log

Keep this log short and chronological. Newest updates at the top.

Reproduced the CoRGI pipeline failure with the real Qwen/Qwen3-VL-8B-Thinking checkpoint and traced it to reasoning outputs that only use ordinal step words.
Taught the text parser to normalize “First/Second step” style markers into numeric indices, refreshed the unit tests to cover the new heuristic, and reran the demo/end-to-end pipeline successfully.
Tidied Qwen generation settings to avoid unused temperature flags when running deterministically.
Validated ROI extraction on a vision-heavy prompt against the real model and hardened prompts so responses stay in structured JSON without verbose preambles.
Added meta-comment pruning so thinking-mode rambles (e.g., redundant “Step 3” reflections) are dropped while preserving genuine reasoning; confirmed with the official demo image that only meaningful steps remain.

Updated default checkpoints to Qwen/Qwen3-VL-8B-Thinking and verified CLI/Gradio/test coverage.
Exercised the real model to capture thinking-style outputs; added parser fallbacks for textual reasoning/ROI responses and stripped <think> tags from answer synthesis.
Extended unit test suite (reasoning, ROI, client helpers) to cover the new parsing paths and ran pytest successfully.

Added optional integration test (corgi_tests/test_integration_qwen.py) gated by CORGI_RUN_QWEN_INTEGRATION for running the real Qwen3-VL model on the official demo asset.
Created runnable example script (examples/demo_qwen_corgi.py) to reproduce the Hugging Face demo prompt locally with structured pipeline logging.
Published Hugging Face Space harness (app.py) and deployment helper (scripts/push_space.sh) including requirements for ZeroGPU tier.
Documented cookbook alignment and inference tips (QWEN_INFERENCE_NOTES.md).
Added CLI runner (corgi.cli) with formatting helpers plus JSON export; authored matching unittest coverage.
Implemented Gradio demo harness (corgi.gradio_app) with markdown reporting and helper utilities for dependency injection.
Expanded unit test suite (CLI + Gradio) and ran pytest corgi_tests successfully (1 skip when gradio missing).
Initialized structured project plan and progress log scaffolding.
Assessed existing modules (corgi.pipeline, corgi.qwen_client, parsers, tests) to identify pending demo features (CLI + Gradio).
Confirmed Qwen3-VL will be the single backbone for reasoning, ROI verification, and answer synthesis.