Image-Text-to-Text
Transformers
Safetensors
English
idefics3
multimodal
vision
conversational