prithivMLmods committed
Commit d575518 · verified · 1 Parent(s): 9fc4c47

Update README.md

Files changed (1):
  1. README.md (+4 -2)
README.md CHANGED
@@ -25,6 +25,8 @@ datasets:
 license: apache-2.0
 ---
 
+![1.png](https://cdn-uploads.huggingface.co/production/uploads/65bb837dbfb878f46c77de4c/CZM7u91ww9SJPFQiY7YlI.png)
+
 # **Camel-Doc-OCR-080125**
 
 > The **Camel-Doc-OCR-080125** model is a fine-tuned version of **Qwen2.5-VL-7B-Instruct**, optimized for **Document Retrieval**, **Content Extraction**, and **Analysis Recognition**. Built on top of the Qwen2.5-VL architecture, this model enhances document comprehension capabilities with focused training on the Opendoc2-Analysis-Recognition dataset for superior document analysis and information extraction tasks.
@@ -124,7 +126,7 @@ This model is intended for:
 | **Dataset Size** | 230K samples (Modular Combustion of Datasets) |
 | **Model Architecture** | `Qwen2_5_VLForConditionalGeneration` |
 | **Total Disk Volume** | 400,000 MB |
-| **Training Time** | approx. 9,360 seconds (\~2.60 hours) |
+| **Training Time** | approx. 9,360(±) seconds (\~2.60 hours) |
 | **Warmup Steps** | 750 |
 | **Precision** | bfloat16 |
 
@@ -145,4 +147,4 @@ This model is intended for:
 [https://arxiv.org/pdf/2308.12966](https://arxiv.org/pdf/2308.12966)
 
 * **A Comprehensive and Challenging OCR Benchmark for Evaluating Large Multimodal Models in Literacy**
-[https://arxiv.org/pdf/2412.02210](https://arxiv.org/pdf/2412.02210)
+[https://arxiv.org/pdf/2412.02210](https://arxiv.org/pdf/2412.02210)