Commit cdcf37e (verified) by THUdyh · Parent(s): f996a4a

Update README.md

Files changed (1): README.md (+8 -2)
README.md CHANGED
@@ -17,9 +17,9 @@ Based on Qwen2.5 language model, it is trained on text, image, video and audio d
 
 Ola offers an on-demand solution to seamlessly and efficiently process visual inputs with arbitrary spatial sizes and temporal lengths.
 
-- **Repository:** https://github.com/xxxxx
+- **Repository:** https://github.com/Ola-Omni/Ola
 - **Languages:** English, Chinese
-- **Paper:** https://arxiv.org/abs/2501.xxxx
+- **Paper:** https://arxiv.org/abs/2502.04328
 
 ## Use
 
@@ -314,3 +314,9 @@ def ola_inference(multimodal, audio_path):
 - **Code:** Pytorch
 
 ## Citation
+@article{liu2025ola,
+  title={Ola: Pushing the Frontiers of Omni-Modal Language Model with Progressive Modality Alignment},
+  author={Liu, Zuyan and Dong, Yuhao and Wang, Jiahui and Liu, Ziwei and Hu, Winston and Lu, Jiwen and Rao, Yongming},
+  journal={arXiv preprint arXiv:2502.04328},
+  year={2025}
+}