Emu3.5 Collection Native Multimodal Models are World Learners • 4 items • Updated 8 days ago • 72
FG-CLIP 2 Collection FG-CLIP 2 is the foundation model for fine-grained vision-language understanding in both English and Chinese. • 10 items • Updated Nov 6, 2025 • 5
google/embeddinggemma-300m Sentence Similarity • 0.3B • Updated Sep 25, 2025 • 799k • 1.38k
Qwen/Qwen3-30B-A3B-Thinking-2507 Text Generation • 31B • Updated Aug 17, 2025 • 403k • 335
prithivMLmods/FLUX.1-Kontext-Cinematic-Relighting Image-to-Image • Updated Jul 26, 2025 • 20 • 12
UNO FLUX • Featured • Runtime error • Generate customized images using text and multiple images • 781
MiroThinker-v0.1 Collection High performance in deep research and tool use. • 7 items • Updated Sep 8, 2025 • 36