ArtAug: Enhancing Text-to-Image Generation through Synthesis-Understanding Interaction Paper • 2412.12888 • Published Dec 17, 2024
EliGen: Entity-Level Controlled Image Generation with Regional Attention Paper • 2501.01097 • Published Jan 2
Breaking the Modality Barrier: Universal Embedding Learning with Multimodal LLMs Paper • 2504.17432 • Published Apr 24 • 39
MinMo: A Multimodal Large Language Model for Seamless Voice Interaction Paper • 2501.06282 • Published Jan 10 • 52
SWIFT:A Scalable lightWeight Infrastructure for Fine-Tuning Paper • 2408.05517 • Published Aug 10, 2024 • 2
FaceChain: A Playground for Human-centric Artificial Intelligence Generated Content Paper • 2308.14256 • Published Aug 28, 2023 • 1
Minimum Tuning to Unlock Long Output from LLMs with High Quality Data as the Key Paper • 2410.10210 • Published Oct 14, 2024 • 6
ModelScope-Agent: Building Your Customizable Agent System with Open-source Large Language Models Paper • 2309.00986 • Published Sep 2, 2023 • 21