Kaichengalex's Collections
MLLM4Embedding

updated 13 days ago

  • UniME

    Collection
    UniME is a series of multimodal large language models trained to learn universal multimodal embeddings. • 4 items • Updated May 16 • 4

  • GME: Improving Universal Multimodal Retrieval by Multimodal LLMs

    Paper • 2412.16855 • Published Dec 22, 2024 • 5

  • VLM2Vec: Training Vision-Language Models for Massive Multimodal Embedding Tasks

    Paper • 2410.05160 • Published Oct 7, 2024 • 4

  • VLM2Vec-V2: Advancing Multimodal Embedding for Videos, Images, and Visual Documents

    Paper • 2507.04590 • Published Jul 7 • 16

  • UniME-V2: MLLM-as-a-Judge for Universal Multimodal Embedding Learning

    Paper • 2510.13515 • Published 15 days ago • 11

  • SAIL-Embedding Technical Report: Omni-modal Embedding Foundation Model

    Paper • 2510.12709 • Published 16 days ago • 10