Sergio Paniego PRO
sergiopaniego
AI & ML interests
None yet
Recent Activity
updated a model about 3 hours ago: sergiopaniego/Qwen3-0.6B-SFT-20250908105022
updated a Space about 3 hours ago: sergiopaniego/trl-trackio
published a Space about 3 hours ago: sergiopaniego/trl-trackio
Organizations
GUI Grounding datasets
Vision comparison ftw
Spaces to compare vision models: there's no single best model, only the best one for your specific use case.
- comparevlms (Running • 40): Compare vision language models
- OCR Time Machine (Running on Zero • 59): Extract text from images and XML files using OCR models
- Compare Docvqa Models (Running • 25): Compare different visual question answering
- Compare Clip Siglip (Running on CPU Upgrade • 23): Compare strong zero-shot image classification models
Vision Language Models: 2025 Update
This collection includes all the models, datasets and Spaces mentioned in the blog "Vision Language Models: 2025 Update".
- Qwen/Qwen2.5-Omni-7B: Any-to-Any • 11B • Updated • 174k • 1.78k
- Qwen2.5 Omni 7B Demo (Running • 345): Generate text and speech from text, audio, images, and videos
- Qwen2.5-Omni Technical Report: Paper • 2503.20215 • Published • 165
- openbmb/MiniCPM-o-2_6: Any-to-Any • 9B • Updated • 180k • 1.23k
Amazing design resources
Vision reasoning datasets
My vision Spaces
Vision Spaces created by me
Awesome vision Spaces
Spaces where I've collaborated or that I consider unique!