
HuggingFaceTB/SmolVLM-Instruct
Image-Text-to-Text
•
2B
•
Updated
•
97.8k
•
530
State-of-the-art compact VLMs for on-device applications: Base, Synthetic, and Instruct. Check our blog: https://huggingface.co/blog/smolvlm
Generate answers by combining text and images