Article Gemma 3n fully available in the open-source ecosystem! By ariG23498 and 7 others • 1 day ago • 28
GLiNER-X Collection The Multilingual Named Entity Recognition (NER) model which is capable of identifying any entity type. • 6 items • Updated 3 days ago • 15
Article Transformers backend integration in SGLang By marcsun13 and 4 others • 4 days ago • 35
RT-DETRv2: Improved Baseline with Bag-of-Freebies for Real-Time Detection Transformer Paper • 2407.17140 • Published Jul 24, 2024 • 2
Article Featherless AI on Hugging Face Inference Providers 🔥 By sbrandeis and 5 others • 15 days ago • 41
RADIO Collection A collection of Foundation Vision Models that combine multiple models (CLIP, DINOv2, SAM, etc.). • 14 items • Updated about 7 hours ago • 22
Article SmolVLA: Efficient Vision-Language-Action Model trained on Lerobot Community Data By danaaubakirova and 8 others • 24 days ago • 164
BLIP3-o: A Family of Fully Open Unified Multimodal Models-Architecture, Training and Dataset Paper • 2505.09568 • Published May 14 • 94
Article The Transformers Library: standardizing model definitions By lysandre and 3 others • May 15 • 114
Skywork-VL Reward: An Effective Reward Model for Multimodal Understanding and Reasoning Paper • 2505.07263 • Published May 12 • 29
Article Finally, a Replacement for BERT: Introducing ModernBERT By bclavie and 14 others • Dec 19, 2024 • 657
OpenVision: A Fully-Open, Cost-Effective Family of Advanced Vision Encoders for Multimodal Learning Paper • 2505.04601 • Published May 7 • 26
Article Vision Language Models (Better, Faster, Stronger) By merve and 4 others • May 12 • 459