is this multimodal / VLM?
#1 by ququwowo - opened
Hi! I saw "multimodal" mentioned in the model card. Is this a vision-language model that can read images, or is it a purely text-based LLM?
Thanks.
Seems like it's text-only, based on their config file.
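For anyone who wants to run the same kind of check, here is a rough sketch of how one might scan a model's `config.json` for signs of a vision component. The key names in `VISION_HINT_KEYS` are an assumption based on keys that appear in some vision-language configs, not an official list, and the two sample configs are made up for illustration (they are not the actual S1-Base config). An empty result only suggests a text-only model; it does not prove it.

```python
import json

# Heuristic key names seen in some vision-language configs.
# This set is an assumption for illustration, not an official or complete list.
VISION_HINT_KEYS = {"vision_config", "image_token_id", "image_token_index", "mm_vision_tower"}


def vision_hints(config: dict) -> list[str]:
    """Return the top-level config keys that suggest a vision component."""
    return sorted(key for key in config if key in VISION_HINT_KEYS)


# Hypothetical, trimmed-down configs (NOT the real S1-Base config).
text_only = json.loads(
    '{"model_type": "some_text_model", "architectures": ["SomeTextModelForCausalLM"]}'
)
with_vision = json.loads(
    '{"model_type": "some_vlm", "vision_config": {"hidden_size": 1024}, "image_token_index": 32000}'
)

print(vision_hints(text_only))    # no vision-related keys in this sample
print(vision_hints(with_vision))  # lists the vision-related keys it found
```

In practice you would load the repository's own `config.json` with `json.load` and pass the resulting dict to `vision_hints`.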
Thank you for your interest. The "multimodal" in S1-Base refers to "scientific modalities" (such as spectra, fields, etc.), not to images. This repository hosts the large language model in the S1-Base series, which is a text-only model.
ScienceOne-AI changed discussion status to closed