OmniCorpus Collection • A Unified Multimodal Corpus of 10 Billion-Level Images Interleaved with Text • 6 items • Updated Apr 20 • 2
ZeroGUI: Automating Online GUI Learning at Zero Human Cost Paper • 2505.23762 • Published May 29 • 46
A Simple Aerial Detection Baseline of Multimodal Language Models Paper • 2501.09720 • Published Jan 16 • 2
Scalable Vision Language Model Training via High Quality Data Curation Paper • 2501.05952 • Published Jan 10 • 3
Preference Optimization for Vision Language Models Article • By qgallouedec and 3 others • Jul 10, 2024 • 79