nkkbr/ViCA
Video-Text-to-Text
Transformers
Safetensors
nkkbr/ViCA-322K
nkkbr/ViCA-thinking-2.68k
English
llava
text-generation
multimodal
vision-language
video understanding
spatial reasoning
visuospatial cognition
qwen
llava-video
Eval Results
License: apache-2.0
nkkbr committed on May 3
Commit 314ad38 · 1 Parent(s): bb2bb30
Add model card with base_model and tags
Files changed (1)
README.md ADDED (+7 -0)
@@ -0,0 +1,7 @@
+---
+base_model: lmms-lab/LLaVA-Video-7B-Qwen2
+tags:
+- llava
+- vision-language
+- fine-tuned
+---
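
To sanity-check front matter like the block added in this commit, a minimal sketch along these lines could be used. It assumes the metadata contains only `key: value` scalars and flat `- item` lists, as this README does; the helper name `parse_front_matter` is hypothetical, and a full YAML parser would be the safer choice for anything more complex.

```python
# Minimal sketch: parse simple README front matter with the standard library only.
# Assumption: the block holds only "key: value" scalars and flat "- item" lists.
# parse_front_matter is a hypothetical helper, not part of any Hugging Face API.

README = """---
base_model: lmms-lab/LLaVA-Video-7B-Qwen2
tags:
- llava
- vision-language
- fine-tuned
---
"""


def parse_front_matter(text):
    """Return the leading '---' delimited metadata block as a dict."""
    lines = text.splitlines()
    if not lines or lines[0].strip() != "---":
        return {}  # no front matter at the top of the file

    meta = {}
    current_key = None
    for line in lines[1:]:
        stripped = line.strip()
        if stripped == "---":
            return meta  # closing delimiter reached
        if not stripped:
            continue  # skip blank lines
        if stripped.startswith("- "):
            # A list item must follow a key that was declared with an empty value.
            if current_key is None or not isinstance(meta[current_key], list):
                raise ValueError("list item without a list key: " + stripped)
            meta[current_key].append(stripped[2:].strip())
        else:
            key, sep, value = stripped.partition(":")
            if not sep:
                raise ValueError("expected 'key: value', got: " + stripped)
            current_key = key.strip()
            value = value.strip()
            # An empty value starts a list; otherwise store the scalar string.
            meta[current_key] = value if value else []
    raise ValueError("front matter is not closed by '---'")


meta = parse_front_matter(README)
print(meta["base_model"])
print(meta["tags"])
```

Raising on an unclosed block or a stray list item is a deliberate choice here: a malformed model card fails loudly instead of silently producing partial metadata.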