Hugging Face
Models: 905 (active filter: multimodal; sort: Trending)
CogACT/CogACT-Base • Robotics • Updated Dec 4, 2024 • 6.22k • 12
CogACT/CogACT-Large • Robotics • Updated Dec 4, 2024 • 85 • 3
CogACT/CogACT-Small • Robotics • Updated Dec 4, 2024 • 640 • 4
rhymes-ai/Aria-Base-64K • Image-Text-to-Text • Updated Dec 1, 2024 • 42 • 14
rhymes-ai/Aria-Chat • Image-Text-to-Text • Updated Dec 15, 2024 • 120 • 11
rhymes-ai/Aria-Base-8K • Image-Text-to-Text • Updated Dec 1, 2024 • 29 • 9
AnyModal/LaTeX-OCR-Llama-3.2-1B • Updated Dec 23, 2024 • 6
unsloth/Pixtral-12B-2409-unsloth-bnb-4bit • Image-Text-to-Text • Updated Dec 4, 2024 • 5k • 10
lmstudio-community/Qwen2-VL-7B-Instruct-GGUF • Image-Text-to-Text • Updated Jan 6 • 3.99k • 6
second-state/Qwen2-VL-7B-Instruct-GGUF • Image-Text-to-Text • Updated Jan 11 • 169 • 5
second-state/Qwen2-VL-2B-Instruct-GGUF • Image-Text-to-Text • Updated Jan 11 • 189 • 3
Sri-Vigneshwar-DJ/Apollo-LMMs-Apollo-7B-t32 • Video-Text-to-Text • Updated Jan 1 • 13 • 1
osunlp/UGround-V1-7B • Image-Text-to-Text • Updated 27 days ago • 2.33k • 13
mradermacher/UGround-V1-7B-GGUF • Updated Jan 4 • 27 • 1
osunlp/UGround-V1-72B-Preview • Image-Text-to-Text • Updated Jan 12 • 18 • 2
nintwentydo/Razorback-12B-v0.1 • Image-Text-to-Text • Updated Jan 10 • 8 • 3
nintwentydo/Razorback-12B-v0.2 • Image-Text-to-Text • Updated Jan 10 • 14 • 3
OpenGVLab/VideoChat-Flash-Qwen2_5-2B_res448 • Video-Text-to-Text • Updated Mar 16 • 1.38k • 19
OpenGVLab/VideoChat-Flash-Qwen2-7B_res224 • Video-Text-to-Text • Updated Mar 16 • 87 • 6
OpenGVLab/VideoChat-Flash-Qwen2-7B_res448 • Video-Text-to-Text • Updated Mar 16 • 681 • 12
osunlp/UGround-V1-72B • Image-Text-to-Text • Updated Jan 23 • 63 • 4
tahamajs/plamma • Updated Feb 9 • 3 • 3
Minthy/ToriiGate-v0.4-7B • Image-Text-to-Text • Updated Jan 22 • 724 • 38
Minthy/ToriiGate-v0.4-2B • Image-Text-to-Text • Updated Jan 19 • 125 • 11
ByteDance-Seed/UI-TARS-2B-SFT • Image-Text-to-Text • Updated Jan 25 • 5.78k • 20
mradermacher/UI-TARS-7B-DPO-GGUF • Updated 27 days ago • 554 • 9
OpenGVLab/InternVideo2_5_Chat_8B • Video-Text-to-Text • Updated Feb 18 • 6.33k • 63
OpenGVLab/InternVL_2_5_HiCo_R64 • Video-Text-to-Text • Updated about 17 hours ago • 201 • 3
lmstudio-community/UI-TARS-7B-DPO-GGUF • Image-Text-to-Text • Updated Jan 23 • 718 • 7
unsloth/Qwen2.5-VL-7B-Instruct • Image-Text-to-Text • Updated 2 days ago • 14.7k • 13