prithivMLmods committed
Commit d575518 · verified · 1 Parent(s): 9fc4c47

Update README.md

Files changed (1):
  1. README.md (+4 -2)
README.md CHANGED
@@ -25,6 +25,8 @@ datasets:
 license: apache-2.0
 ---
 
+![1.png](https://cdn-uploads.huggingface.co/production/uploads/65bb837dbfb878f46c77de4c/CZM7u91ww9SJPFQiY7YlI.png)
+
 # **Camel-Doc-OCR-080125**
 
 > The **Camel-Doc-OCR-080125** model is a fine-tuned version of **Qwen2.5-VL-7B-Instruct**, optimized for **Document Retrieval**, **Content Extraction**, and **Analysis Recognition**. Built on top of the Qwen2.5-VL architecture, this model enhances document comprehension capabilities with focused training on the Opendoc2-Analysis-Recognition dataset for superior document analysis and information extraction tasks.
@@ -124,7 +126,7 @@ This model is intended for:
 | **Dataset Size** | 230K samples (Modular Combustion of Datasets) |
 | **Model Architecture** | `Qwen2_5_VLForConditionalGeneration` |
 | **Total Disk Volume** | 400,000 MB |
-| **Training Time** | approx. 9,360 seconds (\~2.60 hours) |
+| **Training Time** | approx. 9,360(±) seconds (\~2.60 hours) |
 | **Warmup Steps** | 750 |
 | **Precision** | bfloat16 |
 
@@ -145,4 +147,4 @@ This model is intended for:
 [https://arxiv.org/pdf/2308.12966](https://arxiv.org/pdf/2308.12966)
 
 * **A Comprehensive and Challenging OCR Benchmark for Evaluating Large Multimodal Models in Literacy**
-[https://arxiv.org/pdf/2412.02210](https://arxiv.org/pdf/2412.02210)
+[https://arxiv.org/pdf/2412.02210](https://arxiv.org/pdf/2412.02210)