Fine-tuning a VLM vs. combining it with SAM2 for object tracking: which path should I choose? Looking for shared experience

#1
by weihongliang - opened

Hello Denis, you’ve done a really great job. I originally planned to fine-tune a VLM for object tracking, but later I thought of another way: combining it with other models (like SAM2). I’m not sure which path to take right now, so I’d like to ask for your opinion. Also, what made you create this model in the first place? Thank you.

Hi, thanks :)
I think it's better to start with models like SAM. The main reason is that VLMs are not yet good enough for this task, while models like SAM2 (and related trackers such as Track Anything and SAMURAI) show the best results.
