---
license: apache-2.0
pipeline_tag: video-text-to-text
library_name: transformers
---
**<center><span style="font-size:2em;">TinyLLaVA-Video-R1</span></center>**
[arXiv: 2504.09641](https://arxiv.org/abs/2504.09641) | [GitHub: TinyLLaVA-Video-R1](https://github.com/ZhangXJ199/TinyLLaVA-Video-R1)
This model was obtained by cold-starting [TinyLLaVA-Video](https://huggingface.co/Zhang199/TinyLLaVA-Video-Qwen2.5-3B-Group-16-512) with 16 manually annotated samples from the NExT-QA dataset. It serves as the base model for [TinyLLaVA-Video-R1](https://huggingface.co/Zhang199/TinyLLaVA-Video-R1).
The 16 manually annotated samples used for cold-starting have been released [here](https://huggingface.co/datasets/Zhang199/TinyLLaVA-Video-R1-training-data).
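Below is a minimal loading sketch, assuming the checkpoint can be loaded through the `transformers` library with `trust_remote_code=True`; the exact processor, prompt template, and video preprocessing may differ, so please refer to the [GitHub repository](https://github.com/ZhangXJ199/TinyLLaVA-Video-R1) for the authors' inference script. The `model_id` below is a placeholder and should be replaced with this repository's id.

```python
# Sketch only: assumes a standard transformers interface exposed via trust_remote_code.
import torch
from transformers import AutoTokenizer, AutoModelForCausalLM

model_id = "Zhang199/TinyLLaVA-Video-R1"  # placeholder; use this repo's id for the cold-start checkpoint

tokenizer = AutoTokenizer.from_pretrained(model_id, trust_remote_code=True)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.float16,
    device_map="auto",
    trust_remote_code=True,
)
```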