MoCa: Modality-aware Continual Pre-training Makes Better Bidirectional Multimodal Embeddings Paper • 2506.23115 • Published Jun 29, 2025
mmE5: Improving Multimodal Multilingual Embeddings via High-quality Synthetic Data Paper • 2502.08468 • Published Feb 12, 2025
Little Giants: Synthesizing High-Quality Embedding Data at Scale Paper • 2410.18634 • Published Oct 24, 2024