- MoMa: Efficient Early-Fusion Pre-training with Mixture of Modality-Aware Experts
  Paper • 2407.21770 • Published • 23
- VILA^2: VILA Augmented VILA
  Paper • 2407.17453 • Published • 42
- The Synergy between Data and Multi-Modal Large Language Models: A Survey from Co-Development Perspective
  Paper • 2407.08583 • Published • 13
- Vision language models are blind
  Paper • 2407.06581 • Published • 83
RainningXY
xxyyy123
Recent Activity
liked a model about 23 hours ago: AIDC-AI/Ovis-U1-3B
liked a Space 1 day ago: AIDC-AI/Ovis-U1-3B
liked a dataset 10 days ago: sunblaze-ucb/AgentSynth