zhibinlan/LLaVE-0.5B
Image-Text-to-Text • 0.9B • Updated • 38.8k • 7
LLaVE is a series of large language and vision embedding models trained on a variety of multimodal embedding datasets.