A unified model for multimodal understanding, text-to-image generation, and image editing.
Our latest advancement in multimodal large language models (MLLMs).
With 29B parameters, Ovis1.6-Gemma2-27B achieves exceptional performance on the OpenCompass benchmark, ranking among the top-tier open-source MLLMs.
Ovis1.5 is fully open-source: we release the training datasets, the training and inference code, and the model weights.