SpatialBot is a VLM with spatial understanding and reasoning abilities. It interprets depth maps precisely and uses them to perform high-level tasks.
In this HF repo, we provide LoRA checkpoints of SpatialBot-3B, which is based on Phi-2 and SigLIP. The model performs well on general VLM tasks and on spatial understanding benchmarks such as SpatialBench.
To use these LoRA checkpoints, you will also need to download the pretrained checkpoint.
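The pretrained checkpoint is needed because a LoRA checkpoint stores only low-rank update matrices, not full model weights. The sketch below is a conceptual illustration of the standard LoRA merge rule, `W' = W + (alpha / rank) * B @ A`, on a toy 2x2 weight. It is not the SpatialBot merge script: the function names `matmul` and `merge_lora` and all values are hypothetical, and the actual procedure is described in the GitHub repo.

```python
def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def merge_lora(base_weight, lora_a, lora_b, alpha, rank):
    """Return base_weight + (alpha / rank) * (lora_b @ lora_a).

    base_weight: out x in matrix from the pretrained checkpoint
    lora_a:      rank x in matrix from the LoRA checkpoint
    lora_b:      out x rank matrix from the LoRA checkpoint
    """
    scale = alpha / rank
    delta = matmul(lora_b, lora_a)  # low-rank update, same shape as base_weight
    return [
        [w + scale * d for w, d in zip(w_row, d_row)]
        for w_row, d_row in zip(base_weight, delta)
    ]


# Toy values, chosen only for illustration (hypothetical, not real model weights).
base = [[1.0, 0.0], [0.0, 1.0]]   # 2x2 pretrained weight
lora_a = [[1.0, 2.0]]             # rank 1, input dim 2
lora_b = [[0.5], [1.0]]           # output dim 2, rank 1

merged = merge_lora(base, lora_a, lora_b, alpha=2.0, rank=1)
print(merged)
```

Without `base`, the LoRA matrices alone cannot reproduce the merged weight, which is why both downloads are required.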
Paper:
https://arxiv.org/abs/2406.13642
GitHub repo:
https://github.com/BAAI-DCAI/SpatialBot
SpatialBench, the benchmark:
https://huggingface.co/datasets/RussRobin/SpatialBench
Merged SpatialBot-3B: