STEVE: A Step Verification Pipeline for Computer-use Agent Training Paper • 2503.12532 • Published 7 days ago • 13
Efficient Personalization of Quantized Diffusion Model without Backpropagation Paper • 2503.14868 • Published 4 days ago • 19
LEGION: Learning to Ground and Explain for Synthetic Image Detection Paper • 2503.15264 • Published 4 days ago • 17
ViSpeak: Visual Instruction Feedback in Streaming Videos Paper • 2503.12769 • Published 6 days ago • 8
Unlock Pose Diversity: Accurate and Efficient Implicit Keypoint-based Spatiotemporal Diffusion for Audio-driven Talking Portrait Paper • 2503.12963 • Published 6 days ago • 5
GKG-LLM: A Unified Framework for Generalized Knowledge Graph Construction Paper • 2503.11227 • Published 9 days ago • 8
Mitigating Visual Forgetting via Take-along Visual Conditioning for Multi-modal Long CoT Reasoning Paper • 2503.13360 • Published 6 days ago • 5