prithivMLmods committed on
Commit 6eba2e6 · verified · 1 Parent(s): 4cd14fd

Update README.md

Files changed (1)
  1. README.md +18 -1
README.md CHANGED
@@ -36,4 +36,21 @@ tags:
  | **Precision** | bfloat16 |
 
  > [!note]
- > The open dataset image-text response will be updated soon.
+ > The open dataset image-text response will be updated soon.
+
+ ## References
+
+ - **DocVLM: Make Your VLM an Efficient Reader**
+ [https://arxiv.org/pdf/2412.08746v1](https://arxiv.org/pdf/2412.08746v1)
+
+ - **YaRN: Efficient Context Window Extension of Large Language Models**
+ [https://arxiv.org/pdf/2309.00071](https://arxiv.org/pdf/2309.00071)
+
+ - **Qwen2-VL: Enhancing Vision-Language Model’s Perception of the World at Any Resolution**
+ [https://arxiv.org/pdf/2409.12191](https://arxiv.org/pdf/2409.12191)
+
+ - **Qwen-VL: A Versatile Vision-Language Model for Understanding, Localization, Text Reading, and Beyond**
+ [https://arxiv.org/pdf/2308.12966](https://arxiv.org/pdf/2308.12966)
+
+ - **A Comprehensive and Challenging OCR Benchmark for Evaluating Large Multimodal Models in Literacy**
+ [https://arxiv.org/pdf/2412.02210](https://arxiv.org/pdf/2412.02210)