LLM - a xxyyy123 Collection

LLM

Updated Sep 25, 2024

MoMa: Efficient Early-Fusion Pre-training with Mixture of Modality-Aware Experts

Paper • 2407.21770 • Published Jul 31, 2024 • 23

VILA^2: VILA Augmented VILA

Paper • 2407.17453 • Published Jul 24, 2024 • 42

The Synergy between Data and Multi-Modal Large Language Models: A Survey from Co-Development Perspective

Paper • 2407.08583 • Published Jul 11, 2024 • 13

Vision language models are blind

Paper • 2407.06581 • Published Jul 9, 2024 • 83

ColPali: Efficient Document Retrieval with Vision Language Models

Paper • 2407.01449 • Published Jun 27, 2024 • 48

Multimodal Structured Generation: CVPR's 2nd MMFM Challenge Technical Report

Paper • 2406.11403 • Published Jun 17, 2024 • 4

AIDC-AI/Ovis1.6-Gemma2-9B

Image-Text-to-Text • Updated Feb 26 • 906 • 270