SigLIP or SigLIP2 encoder?
Yes, but which SigLIP checkpoint did Gemma 3 use? SigLIP2 or SigLIP?
Thank you!
Orr
@orrzohar: A or B?
@GopiUppari: Yes, ...
🤣
@orrzohar
Yes, SigLIP 2 utilizes a similar encoder architecture to SigLIP. In Gemma 3, they used a 400M-parameter variant of the SigLIP vision encoder.
SigLIP-So400m
@prithivMLmods
Trust me, I am familiar with SigLIP and SigLIP2. Both have shape-optimized model variants. I know.
I just want to know WHICH was used.
Are you from the Google org? (There is no Google tag, and I already checked the technical report and all the model configs; you can't tell from those.)
Do you know that they used SigLIP-SO400M and not SigLIP2-SO400M?
Thanks
Orr
No, I'm not from Google Org. I just read your discussion, so I responded.
I also analyzed the technical report, but I didn’t see anything about it.
I remember reading an article about Gemma 3, possibly from Gradient Flow (mid-March). It clearly mentioned that they used a 400M-parameter variant of the SigLIP vision encoder.
@orrzohar
Edit:
Yeah, this newsletter!
https://gradientflow.com/gemma-3-what-you-need-to-know/?utm_source=chatgpt.com
I’ve had this question since the release. Rather than guessing, let's ask the organization directly once again.
Hi @GopiUppari, can you please tell me exactly which vision encoder (SigLIP or SigLIP2) is used in the Gemma 3 family of models? Is it SigLIP-SO400M?
Thank you!
Prithiv