GigaTok: Scaling Visual Tokenizers to 3 Billion Parameters for Autoregressive Image Generation Paper • 2504.08736 • Published 5 days ago • 39
DeepSeek-R1 Thoughtology: Let's <think> about LLM Reasoning Paper • 2504.07128 • Published 14 days ago • 72
Post: Moonshot AI 月之暗面 🌛 @Kimi_Moonshot just dropped an MoE VLM and an MoE Reasoning VLM on the hub!!
Model: https://huggingface.co/collections/moonshotai/kimi-vl-a3b-67f67b6ac91d3b03d382dd85
✨ 3B with MIT license
✨ Long context windows up to 128K
✨ Strong multimodal reasoning (36.8% on MathVision, on par with 10x larger models) and agent skills (34.5% on ScreenSpot-Pro)
Kimi-VL-A3B Collection Moonshot's efficient MoE VLMs, exceptional on agent, long-context, and thinking • 6 items • Updated 4 days ago • 58
OmniSVG: A Unified Scalable Vector Graphics Generation Model Paper • 2504.06263 • Published 8 days ago • 141
Less-to-More Generalization: Unlocking More Controllability by In-Context Generation Paper • 2504.02160 • Published 13 days ago • 33
An Empirical Study of GPT-4o Image Generation Capabilities Paper • 2504.05979 • Published 8 days ago • 59
Science-T2I Collection Addressing Scientific Illusions in Image Synthesis • 9 items • Updated 12 days ago • 3
MoCha: Towards Movie-Grade Talking Character Synthesis Paper • 2503.23307 • Published 17 days ago • 121
Qwen2.5-Omni Collection End-to-End Omni (text, audio, image, video, and natural speech interaction) model based on Qwen2.5 • 3 items • Updated 20 days ago • 87
Modifying Large Language Model Post-Training for Diverse Creative Writing Paper • 2503.17126 • Published 26 days ago • 35