view article Article nanoVLM: The simplest repository to train your VLM in pure PyTorch By ariG23498 and 6 others • 21 days ago • 144
Qwen2.5-VL Collection Vision-language model series based on Qwen2.5 • 11 items • Updated Apr 28 • 484
view article Article SmolVLM Grows Smaller – Introducing the 250M & 500M Models! By andito and 2 others • Jan 23 • 180
view article Article SmolVLM2: Bringing Video Understanding to Every Device By orrzohar and 6 others • Feb 20 • 263
view article Article SmolVLM - small yet mighty Vision Language Model By andito and 4 others • Nov 26, 2024 • 310
Video-LLaMA: An Instruction-tuned Audio-Visual Language Model for Video Understanding Paper • 2306.02858 • Published Jun 5, 2023 • 19
Video Mamba Suite: State Space Model as a Versatile Alternative for Video Understanding Paper • 2403.09626 • Published Mar 14, 2024 • 15
MoAI: Mixture of All Intelligence for Large Language and Vision Models Paper • 2403.07508 • Published Mar 12, 2024 • 77
CroissantLLM: A Truly Bilingual French-English Language Model Paper • 2402.00786 • Published Feb 1, 2024 • 27