is this multimodal / VLM?

#1
by ququwowo - opened

Hi! I saw "multimodal" mentioned in the model card. Is this a vision-language model that can read images, or is it a pure text-based LLM?

Thanks.

Seems like it's text-only, based on their config file.
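For anyone who wants to run the same kind of check themselves, here is a minimal sketch. It assumes you have downloaded the repo's `config.json` locally and that a vision-capable checkpoint would expose one of a few commonly seen keys (for example `vision_config`). The key list, the helper names, and the two sample dicts are all hypothetical illustrations, not taken from this repository, and a missing key is only a hint, not proof.

```python
import json
from pathlib import Path

# Hypothetical list of top-level config keys that often signal an image encoder.
# This is a heuristic, not an official or exhaustive list.
VISION_KEYS = ("vision_config", "vision_tower", "mm_vision_tower", "image_token_index")


def looks_multimodal(config: dict) -> bool:
    """Return True if any vision-related key appears at the top level of the config."""
    return any(key in config for key in VISION_KEYS)


def load_config(path: str) -> dict:
    """Read a locally downloaded config.json into a dict."""
    return json.loads(Path(path).read_text(encoding="utf-8"))


# Made-up sample configs for illustration only.
text_only = {"architectures": ["ExampleForCausalLM"], "model_type": "example"}
vlm_like = {"model_type": "example_vlm", "vision_config": {"hidden_size": 1024}}

print(looks_multimodal(text_only))
print(looks_multimodal(vlm_like))
```

To check a real download, something like `looks_multimodal(load_config("config.json"))` would apply the same heuristic to the file on disk.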

Thank you for your interest. The "multimodal" in S1-Base refers to scientific modalities (such as spectra and fields). This repository hosts the large language model in the S1-Base series, which is a text-only model.

ScienceOne-AI changed discussion status to closed
