Image input
#8 by mvsoom - opened
This looks very promising :) One question: does it support image input (a single image, btw)? Or did it catastrophically forget the visual modality when it was fine-tuned on literature? Cheers
OK, I see from the Gemma-2-9b-it model card that it is text-to-text.
Would you happen to know of models that are also low on slop but still have image multimodality?
Images are good sources of entropy for writing imho.
Cheers
Ah, good question. tbh I haven't tested a lot of open models with image modality. There will be a lot of Gemma 3 fine-tunes appearing soon, so that might be a good bet. The vanilla instruct Gemma 3 can be really good, but it needs some prompting to steer it away from its default safe/slop style.
Alright, thanks :)
mvsoom changed discussion status to closed