Article PaliGemma 2 Mix - New Instruction Vision Language Models by Google 3 days ago • 49
Article Introducing Three New Serverless Inference Providers: Hyperbolic, Nebius AI Studio, and Novita 🔥 4 days ago • 86
BIOMEDICA: An Open Biomedical Image-Caption Archive, Dataset, and Vision-Language Models Derived from Scientific Literature Paper • 2501.07171 • Published Jan 13 • 50
Scaling Pre-training to One Hundred Billion Data for Vision Language Models Paper • 2502.07617 • Published 10 days ago • 27
Qwen2-VL Collection Vision-language model series based on Qwen2 • 16 items • Updated Dec 6, 2024 • 205
Hibiki fr-en Collection Hibiki is a model for streaming speech translation, which can run on device! See https://github.com/kyutai-labs/hibiki. • 5 items • Updated 15 days ago • 49
SmolLM2 Collection State-of-the-art compact LLMs for on-device applications: 1.7B, 360M, 135M • 16 items • Updated 1 day ago • 238
Stable Flow: Vital Layers for Training-Free Image Editing Paper • 2411.14430 • Published Nov 21, 2024 • 22
Diffusers Guides Collection Collection of diffusers guides and their respective spaces • 2 items • Updated Oct 9, 2024 • 2
DeepSeek R1 (All Versions) Collection DeepSeek R1 - the most powerful reasoning open-source model - available in GGUF, original & 4-bit formats. Includes Llama & Qwen distilled models. • 29 items • Updated 13 days ago • 192