Article Welcome the NVIDIA Llama Nemotron Nano VLM to Hugging Face Hub By nvidia and 10 others • 9 days ago • 23
Running 7 YOLOv11 Document Layout Analysis 🏃 Inference example of a YOLOv11-x model trained on the DocLayNet dataset.
unsloth/Qwen2.5-VL-7B-Instruct-unsloth-bnb-4bit Image-Text-to-Text • 5B • Updated May 12 • 38.2k • 34
Article SigLIP 2: A better multilingual vision language encoder By ariG23498 and 2 others • Feb 21 • 172
Article Introducing Idefics2: A Powerful 8B Vision-Language Model for the community By Leyo and 2 others • Apr 15, 2024 • 182
Article Training and Finetuning Reranker Models with Sentence Transformers v4 By tomaarsen • Mar 26 • 143
google/siglip2-giant-opt-patch16-384 Zero-Shot Image Classification • 2B • Updated Feb 21 • 69.1k • 17