Sa2VA-i: Improving Sa2VA Results with Consistent Training and Inference

Acknowledgement

We thank Sa2VA authors for their contribution.

@article{sa2va,
  title={Sa2VA: Marrying SAM2 with LLaVA for Dense Grounded Understanding of Images and Videos},
  author={Yuan, Haobo and Li, Xiangtai and Zhang, Tao and Huang, Zilong Huang and Xu, Shilin and Ji, Shunping and Tong, Yunhai and Qi, Lu and Feng, Jiashi and Yang, Ming-Hsuan},
  journal={arXiv preprint},
  year={2025}
}
Downloads last month
6
Safetensors
Model size
3.95B params
Tensor type
F32
·
BF16
·
Inference Providers NEW
This model isn't deployed by any Inference Provider. 🙋 Ask for provider support

Model tree for kumuji/Sa2VA-i-4B

Merge model
this model