Scaling Synthetic Data Creation with 1,000,000,000 Personas Paper • 2406.20094 • Published Jun 28, 2024 • 100
CoSTA*: Cost-Sensitive Toolpath Agent for Multi-turn Image Editing Paper • 2503.10613 • Published 11 days ago • 73
Token-Efficient Long Video Understanding for Multimodal LLMs Paper • 2503.04130 • Published 19 days ago • 84
AudioX: Diffusion Transformer for Anything-to-Audio Generation Paper • 2503.10522 • Published 11 days ago • 19
Aligning Multimodal LLM with Human Preference: A Survey Paper • 2503.14504 • Published 6 days ago • 20
Frac-Connections: Fractional Extension of Hyper-Connections Paper • 2503.14125 • Published 7 days ago • 19