GenRecal: Generation after Recalibration from Large to Small Vision-Language Models • Paper • 2506.15681 • Published 17 days ago • 36
Discrete Diffusion in Large Language and Multimodal Models: A Survey • Paper • 2506.13759 • Published 19 days ago • 41
Scientists' First Exam: Probing Cognitive Abilities of MLLM via Perception, Understanding, and Reasoning • Paper • 2506.10521 • Published 23 days ago • 65
Wait, We Don't Need to "Wait"! Removing Thinking Tokens Improves Reasoning Efficiency • Paper • 2506.08343 • Published 25 days ago • 48
EmbodiedGen: Towards a Generative 3D World Engine for Embodied Intelligence • Paper • 2506.10600 • Published 23 days ago • 7
PosterCraft: Rethinking High-Quality Aesthetic Poster Generation in a Unified Framework • Paper • 2506.10741 • Published 23 days ago • 27
Time to Talk: LLM Agents for Asynchronous Group Communication in Mafia Games • Paper • 2506.05309 • Published 30 days ago • 14
SpatialLM: Training Large Language Models for Structured Indoor Modeling • Paper • 2506.07491 • Published 26 days ago • 38
PartCrafter: Structured 3D Mesh Generation via Compositional Latent Diffusion Transformers • Paper • 2506.05573 • Published 30 days ago • 69
Audio-Aware Large Language Models as Judges for Speaking Styles • Paper • 2506.05984 • Published 29 days ago • 14
Voyager: Long-Range and World-Consistent Video Diffusion for Explorable 3D Scene Generation • Paper • 2506.04225 • Published about 1 month ago • 25
Inference-Time Hyper-Scaling with KV Cache Compression • Paper • 2506.05345 • Published 30 days ago • 27
Reflect, Retry, Reward: Self-Improving LLMs via Reinforcement Learning • Paper • 2505.24726 • Published May 30 • 258