Eagle 2 Collection Eagle 2 is a family of frontier vision-language models with a vision-centric design. The models support 4K HD input, long-context video, and grounding. • 9 items • Updated 7 days ago • 26
Qwen2.5-1M Collection The long-context version of Qwen2.5, supporting 1M-token context lengths • 2 items • Updated 4 days ago • 83
SmolVLM 256M & 500M Collection Collection of models & demos for the even smoller SmolVLM release • 12 items • Updated 7 days ago • 61
Qwen2.5-VL Collection Vision-language model series based on Qwen2.5 • 3 items • Updated 3 days ago • 277
Parameters vs FLOPs: Scaling Laws for Optimal Sparsity for Mixture-of-Experts Language Models Paper • 2501.12370 • Published 9 days ago • 7
CodeMonkeys: Scaling Test-Time Compute for Software Engineering Paper • 2501.14723 • Published 6 days ago • 6
Mixture-of-Mamba: Enhancing Multi-Modal State-Space Models with Modality-Aware Sparsity Paper • 2501.16295 • Published 3 days ago • 5
Are Vision Language Models Texture or Shape Biased and Can We Steer Them? Paper • 2403.09193 • Published Mar 14, 2024 • 8
iFormer: Integrating ConvNet and Transformer for Mobile Application Paper • 2501.15369 • Published 4 days ago • 9
Emilia: A Large-Scale, Extensive, Multilingual, and Diverse Dataset for Speech Generation Paper • 2501.15907 • Published 3 days ago • 14
Can We Generate Images with CoT? Let's Verify and Reinforce Image Generation Step by Step Paper • 2501.13926 • Published 7 days ago • 28
GeoPixel: Pixel Grounding Large Multimodal Model in Remote Sensing Paper • 2501.13925 • Published 7 days ago • 5