yjj23/minivlm
Tags: PyTorch, English, vlm_model, vision-language-model, multimodal, vision, qwen, siglip
VLM Model: Qwen2.5 + SigLIP
This model combines:
Vision encoder: google/siglip-base-patch16-224
Language model: Qwen/Qwen2.5-0.5B-Instruct
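The card does not say how the two components are connected. A common pattern for this kind of pairing is to project the vision encoder's patch embeddings into the language model's embedding space and prepend them to the text tokens. The sketch below only works out the shapes such a bridge would involve. The hidden sizes (768 for the SigLIP base encoder, 896 for Qwen2.5-0.5B) are taken from the publicly released configurations of those checkpoints and should be verified against the actual `config.json` files. The names `VisionSpec`, `LanguageSpec`, `num_image_tokens`, `projector_shape`, and `sequence_length` are hypothetical helpers introduced for illustration, not part of this repository.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class VisionSpec:
    # Values implied by the checkpoint name "siglip-base-patch16-224".
    image_size: int = 224
    patch_size: int = 16
    # Assumption: hidden size of the SigLIP base vision tower.
    hidden_size: int = 768


@dataclass(frozen=True)
class LanguageSpec:
    # Assumption: embedding width of Qwen2.5-0.5B-Instruct.
    hidden_size: int = 896


def num_image_tokens(vision: VisionSpec) -> int:
    """Number of patch embeddings produced for one square image."""
    patches_per_side = vision.image_size // vision.patch_size
    return patches_per_side ** 2


def projector_shape(vision: VisionSpec, language: LanguageSpec) -> tuple[int, int]:
    """(in_features, out_features) of a single linear projector.

    Assumption: a plain linear layer bridges the two models; the real
    model may use an MLP or another connector.
    """
    return (vision.hidden_size, language.hidden_size)


def sequence_length(vision: VisionSpec, n_text_tokens: int) -> int:
    """Total sequence length if every image token is prepended to the text."""
    return num_image_tokens(vision) + n_text_tokens


if __name__ == "__main__":
    v, l = VisionSpec(), LanguageSpec()
    print("image tokens:", num_image_tokens(v))
    print("projector shape:", projector_shape(v, l))
    print("sequence length with 32 text tokens:", sequence_length(v, 32))
```

Under these assumptions, each 224×224 image contributes (224 / 16)² = 196 tokens to the language model's input, which is worth keeping in mind when budgeting context length for a 0.5B-parameter model.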