Update README.md
README.md
CHANGED
@@ -21,29 +21,22 @@ pipeline_tag: image-classification
 
 # newspaper_classifier_segformer
 
-This model is a fine-tuned version of `nvidia/mit-b0` on
-
-
-## Model
-
-
-
-
--
--
-
-
-
-
-
-
-Intended Uses:
-- Newspaper Image Classification: The model is intended to classify newspaper images into segment and no_segment categories.
-- OCR Preprocessing: It can be used as a preprocessing step for OCR tasks to identify images that will require further text localization.
-
-Limitations:
--Domain-Specific: The model is fine-tuned on the `taresco/newspaper_ocr` dataset and may not generalize well to other types of images or domains.
-- Image Quality: The model's performance may degrade on low-quality or noisy images.
+This model is a fine-tuned version of `nvidia/mit-b0` on a document OCR dataset. It classifies text document images into two categories: those requiring special segmentation processing (`segment`) and those that don't (`no_segment`). This classification is a critical preprocessing step in our OCR pipeline, enabling optimized document processing paths.
+
+
+## Model Details
+- **Base Architecture**: SegFormer (`nvidia/mit-b0`) - a transformer-based architecture that balances efficiency and performance for vision tasks
+- **Training Dataset**: `taresco/document_ocr` - specialized collection of text document images with segmentation annotations
+- **Input Format**: RGB images resized to 512×512 pixels
+- **Output Classes**:
+- `segment`: Images containing two or more distinct, unrelated text segments that require special OCR processing
+- `no_segment`: Images containing single, cohesive content that can follow standard
+
+## Intended Uses & Applications
+- **OCR Pipeline Integration**: Primary use is as a preprocessing classifier in OCR workflows for document digitization
+- **Document Routing**: Automatically route documents to specialized segmentation processing when needed
+- **Batch Processing**: Efficiently handle large collections of document archives by applying appropriate processing techniques
+- **Digital Library Processing**: Support for historical text document digitization projects
 
 
 ## Training and evaluation data
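The document-routing use described in the added README text could be sketched roughly as follows. This is a minimal illustration, not the authors' pipeline. The label names `segment` and `no_segment` come from the README. The 0.5 threshold, the path names `segmentation_ocr` and `standard_ocr`, and the `model_id` argument are assumptions made for the example. The prediction format mirrors what a `transformers` image-classification pipeline returns (a list of `{"label", "score"}` dicts).

```python
from typing import List

# Label names are taken from the README text.
SEGMENT_LABEL = "segment"
NO_SEGMENT_LABEL = "no_segment"


def route_document(predictions: List[dict], threshold: float = 0.5) -> str:
    """Pick an OCR path from image-classification scores.

    `predictions` is a list of {"label": str, "score": float} dicts.
    The threshold and the returned path names are illustrative assumptions.
    """
    scores = {p["label"]: p["score"] for p in predictions}
    if scores.get(SEGMENT_LABEL, 0.0) >= threshold:
        return "segmentation_ocr"  # hypothetical name for the special path
    return "standard_ocr"  # hypothetical name for the default path


def classify_image(image_path: str, model_id: str) -> List[dict]:
    """Run the classifier on one image.

    Requires `transformers` and `Pillow`. `model_id` is whatever Hub
    repository hosts this model; it is not stated in the diff.
    """
    from transformers import pipeline  # lazy import: routing works without it

    classifier = pipeline("image-classification", model=model_id)
    return classifier(image_path)


# Routing on an already-computed prediction, no model download needed.
example = [
    {"label": "segment", "score": 0.91},
    {"label": "no_segment", "score": 0.09},
]
print(route_document(example))  # prints "segmentation_ocr"
```

Keeping the routing decision separate from inference means the threshold can be tuned or tested on stored predictions without loading the model.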