MedVLSynther: Synthesizing High-Quality Visual Question Answering from Medical Documents with Generator-Verifier LMMs Paper • 2510.25867 • Published 3 days ago • 5
Exploring Conditions for Diffusion models in Robotic Control Paper • 2510.15510 • Published 15 days ago • 36
Surfer 2: The Next Generation of Cross-Platform Computer Use Agents Paper • 2510.19949 • Published 10 days ago • 28
The End of Manual Decoding: Towards Truly End-to-End Language Models Paper • 2510.26697 • Published 1 day ago • 78
Article On the Shifting Global Compute Landscape By huggingface and 1 other • 3 days ago • 23
Article Hall of Multimodal OCR VLMs and Demonstrations By prithivMLmods • about 20 hours ago • 3
Article Granite 4.0 Nano: Just how small can you go? By ibm-granite and 1 other • 4 days ago • 89
RoboOmni: Proactive Robot Manipulation in Omni-modal Context Paper • 2510.23763 • Published 5 days ago • 52
AgentFold: Long-Horizon Web Agents with Proactive Context Management Paper • 2510.24699 • Published 4 days ago • 62
LoftUp: Learning a Coordinate-Based Feature Upsampler for Vision Foundation Models Paper • 2504.14032 • Published Apr 18 • 7
Open Multimodal Retrieval-Augmented Factual Image Generation Paper • 2510.22521 • Published 6 days ago • 30
Lookahead Anchoring: Preserving Character Identity in Audio-Driven Human Animation Paper • 2510.23581 • Published 5 days ago • 41
ReCode: Unify Plan and Action for Universal Granularity Control Paper • 2510.23564 • Published 5 days ago • 115
Heavy Labels Out! Dataset Distillation with Label Space Lightening Paper • 2408.08201 • Published Aug 15, 2024 • 21
Sample By Step, Optimize By Chunk: Chunk-Level GRPO For Text-to-Image Generation Paper • 2510.21583 • Published 8 days ago • 30
Article Promoter-GPT: Writing DNA Instructions with Language Models By hugging-science • 10 days ago • 22