Article How Much Power does a SOTA Open Video Model Use? ⚡🎥 By jdelavande and 2 others • 16 days ago • 13
Article Gemma 3n fully available in the open-source ecosystem! By ariG23498 and 7 others • 22 days ago • 109
Article Common Pitfalls in Sharing Open Source Models on Hugging Face (and How to Dodge Them) By FriendliAI and 2 others • 17 days ago • 21
SmolVLA: A Vision-Language-Action Model for Affordable and Efficient Robotics Paper • 2506.01844 • Published Jun 2 • 114
LLaDA 1.5: Variance-Reduced Preference Optimization for Large Language Diffusion Models Paper • 2505.19223 • Published May 25 • 8
This Time is Different: An Observability Perspective on Time Series Foundation Models Paper • 2505.14766 • Published May 20 • 39
Article NVIDIA Cosmos Now Available On Hugging Face For Physical AI Reasoning By PranjaliJoshi and 1 other • May 19 • 25
The Audio-Visual BatVision Dataset for Research on Sight and Sound Paper • 2303.07257 • Published Mar 13, 2023 • 1
NeRAF: 3D Scene Infused Neural Radiance and Acoustic Fields Paper • 2405.18213 • Published May 28, 2024 • 1
Article Falcon-Edge: A series of powerful, universal, fine-tunable 1.58bit language models. By tiiuae and 9 others • May 15 • 35
Article Blazingly fast whisper transcriptions with Inference Endpoints By mfuntowicz and 5 others • May 13 • 71
Article Vision Language Models (Better, Faster, Stronger) By merve and 4 others • May 12 • 481
LLaMA-Omni2: LLM-based Real-time Spoken Chatbot with Autoregressive Streaming Speech Synthesis Paper • 2505.02625 • Published May 5 • 22
Voila: Voice-Language Foundation Models for Real-Time Autonomous Interaction and Voice Role-Play Paper • 2505.02707 • Published May 5 • 84
Article Welcome Llama 4 Maverick & Scout on Hugging Face! By burtenshaw and 6 others • Apr 5 • 145