Seedance 1.0: Exploring the Boundaries of Video Generation Models Paper • 2506.09113 • Published 4 days ago • 60
Confidence Is All You Need: Few-Shot RL Fine-Tuning of Language Models Paper • 2506.06395 • Published 8 days ago • 95
UniWorld: High-Resolution Semantic Encoders for Unified Visual Understanding and Generation Paper • 2506.03147 • Published 11 days ago • 58
Qwen3 Embedding: Advancing Text Embedding and Reranking Through Foundation Models Paper • 2506.05176 • Published 9 days ago • 55
The Entropy Mechanism of Reinforcement Learning for Reasoning Language Models Paper • 2505.22617 • Published 17 days ago • 120
Scaling Image and Video Generation via Test-Time Evolutionary Search Paper • 2505.17618 • Published 22 days ago • 41
Shifting AI Efficiency From Model-Centric to Data-Centric Compression Paper • 2505.19147 • Published 20 days ago • 145
Vid2World: Crafting Video Diffusion Models to Interactive World Models Paper • 2505.14357 • Published 25 days ago • 26
Emerging Properties in Unified Multimodal Pretraining Paper • 2505.14683 • Published 25 days ago • 130
Step1X-3D: Towards High-Fidelity and Controllable Generation of Textured 3D Assets Paper • 2505.07747 • Published May 12 • 60
MiniMax-Speech: Intrinsic Zero-Shot Text-to-Speech with a Learnable Speaker Encoder Paper • 2505.07916 • Published May 12 • 124
CAST: Component-Aligned 3D Scene Reconstruction from an RGB Image Paper • 2502.12894 • Published Feb 18 • 13
LightLab: Controlling Light Sources in Images with Diffusion Models Paper • 2505.09608 • Published about 1 month ago • 31
BLIP3-o: A Family of Fully Open Unified Multimodal Models-Architecture, Training and Dataset Paper • 2505.09568 • Published about 1 month ago • 93