A unified framework for detecting point and collective anomalies in operating system logs via collaborative transformers Paper • 2512.23380 • Published 5 days ago • 24
UltraShape 1.0: High-Fidelity 3D Shape Generation via Scalable Geometric Refinement Paper • 2512.21185 • Published 10 days ago • 21
SmartSnap: Proactive Evidence Seeking for Self-Verifying Agents Paper • 2512.22322 • Published 8 days ago • 35
Stream-DiffVSR: Low-Latency Streamable Video Super-Resolution via Auto-Regressive Diffusion Paper • 2512.23709 • Published 4 days ago • 40
Let It Flow: Agentic Crafting on Rock and Roll, Building the ROME Model within an Open Agentic Learning Ecosystem Paper • 2512.24873 • Published 3 days ago • 39
SpotEdit: Selective Region Editing in Diffusion Transformers Paper • 2512.22323 • Published 8 days ago • 36
Dream-VL & Dream-VLA: Open Vision-Language and Vision-Language-Action Models with Diffusion Language Model Backbone Paper • 2512.22615 • Published 7 days ago • 39
Yume-1.5: A Text-Controlled Interactive World Generation Model Paper • 2512.22096 • Published 7 days ago • 55
Youtu-LLM: Unlocking the Native Agentic Potential for Lightweight Large Language Models Paper • 2512.24618 • Published 3 days ago • 60
LiveTalk: Real-Time Multimodal Interactive Video Diffusion via Improved On-Policy Distillation Paper • 2512.23576 • Published 5 days ago • 62
Coupling Experts and Routers in Mixture-of-Experts via an Auxiliary Loss Paper • 2512.23447 • Published 5 days ago • 86
InSight-o3: Empowering Multimodal Foundation Models with Generalized Visual Search Paper • 2512.18745 • Published 13 days ago • 10
Omni-Weather: Unified Multimodal Foundation Model for Weather Generation and Understanding Paper • 2512.21643 • Published 9 days ago • 10
ProEdit: Inversion-based Editing From Prompts Done Right Paper • 2512.22118 • Published 7 days ago • 16
UniPercept: Towards Unified Perceptual-Level Image Understanding across Aesthetics, Quality, Structure, and Texture Paper • 2512.21675 • Published 9 days ago • 24
MAI-UI Technical Report: Real-World Centric Foundation GUI Agents Paper • 2512.22047 • Published 8 days ago • 25
InsertAnywhere: Bridging 4D Scene Geometry and Diffusion Models for Realistic Video Object Insertion Paper • 2512.17504 • Published 15 days ago • 94
Emergent temporal abstractions in autoregressive models enable hierarchical reinforcement learning Paper • 2512.20605 • Published 10 days ago • 59