nkkbr committed
Commit e119ac2 · verified · 1 Parent(s): 8185a7d

Update README.md

Files changed (1)
1. README.md +0 -4
README.md CHANGED
@@ -63,16 +63,12 @@ model-index:
 
 # ViCA-7B: Visuospatial Cognitive Assistant
 
-[![arXiv](https://img.shields.io/badge/arXiv-2505.12312-B31B1B?logo=arxiv&link=https://arxiv.org/abs/2505.12312)](https://arxiv.org/abs/2505.12312)
-
 > You may also be interested in our other project, **ViCA2**. Please refer to the following links:
 
 [![GitHub](https://img.shields.io/badge/GitHub-ViCA2-181717?logo=github&logoColor=white)](https://github.com/nkkbr/ViCA)
 
 [![Hugging Face Models](https://img.shields.io/badge/%F0%9F%A4%97%20Hugging%20Face-ViCA2-blue)](https://huggingface.co/nkkbr/ViCA2)
 
-[![arXiv](https://img.shields.io/badge/arXiv-2505.12363-B31B1B?logo=arxiv&link=https://arxiv.org/abs/2505.12363)](https://arxiv.org/abs/2505.12363)
-
 ## Overview
 
 **ViCA-7B** is a vision-language model specifically fine-tuned for *visuospatial reasoning* in indoor video environments. Built upon the LLaVA-Video-7B-Qwen2 architecture, it is trained using our newly proposed **ViCA-322K dataset**, which emphasizes both structured spatial annotations and complex instruction-based reasoning tasks.
 