Document QA Model

This is a document question-answering model fine-tuned from microsoft/layoutlmv3-base. It takes OCR output from PaddleOCR (tokens and bounding boxes) and answers questions about structured information in the document layout.


Model Details

Model Description

  • Model Name: document-qa-model
  • Base Model: microsoft/layoutlmv3-base
  • Fine-tuned by: Lakshya Singh (solo contributor)
  • Languages: English, Spanish, French, German, Italian
  • License: Apache-2.0 (inherited from base model)
  • Intended Use: Extract answers to structured queries from scanned documents
  • Funding: None; this project was completed independently

Model Sources


Uses

Direct Use

This model can be used for:

  • Question answering on document images (scanned PDFs, invoices, utility bills)
  • Information extraction tasks using OCR and layout-aware understanding

Out-of-Scope Use

  • Not suitable for conversational QA
  • Not suitable for images without OCR-extracted text, since the model relies on OCR tokens and their bounding boxes

Training Details

Dataset

The dataset consisted of:

  • Images of utility bills and documents
  • OCR data with bounding boxes (from PaddleOCR)
  • Queries in English, Spanish, and Chinese
  • Answer spans with match scores and positions
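
To make the list above concrete, here is a minimal sketch of how one training record could be represented. The class name and every field name are hypothetical; the card does not publish the dataset schema, so treat this as one plausible layout rather than the actual format.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class QAExample:
    """One (document, question, answer) record. All field names are hypothetical."""
    image_path: str          # path to the page image
    words: List[str]         # OCR tokens in reading order
    boxes: List[List[int]]   # one [x0, y0, x1, y1] box per word
    question: str            # query text
    answer_start: int        # index of the first answer word
    answer_end: int          # index of the last answer word (inclusive)
    match_score: float       # how well the answer matched the OCR text

    def answer_text(self) -> str:
        # Rebuild the answer string from the word span.
        return " ".join(self.words[self.answer_start:self.answer_end + 1])
```

Storing the answer as word indices rather than raw text keeps it aligned with the OCR tokens that the model actually sees.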

Training Procedure

  • Preprocessing: PaddleOCR was used to extract tokens, positions, and structure
  • Model: LayoutLMv3-base
  • Epochs: 4
  • Learning rate schedule: plotted in training_history.png
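
The preprocessing step can be sketched roughly as follows. LayoutLM-family models expect bounding boxes normalized to a 0-1000 scale, so OCR coordinates in pixels have to be rescaled. This sketch assumes the PaddleOCR 2.x line format, where each detected line is a four-point polygon plus a (text, confidence) pair, and it assigns the line's box to every word on that line. Both the helper names and that word-splitting choice are assumptions, not necessarily what was done for this model.

```python
from typing import List, Sequence, Tuple


def polygon_to_box(polygon: Sequence[Sequence[float]], width: int, height: int) -> List[int]:
    """Convert a pixel-space polygon to a [x0, y0, x1, y1] box on a 0-1000 scale."""
    xs = [point[0] for point in polygon]
    ys = [point[1] for point in polygon]

    def scale(value: float, size: int) -> int:
        # Rescale to 0-1000 and clamp in case OCR points fall outside the image.
        return max(0, min(1000, int(1000 * value / size)))

    return [scale(min(xs), width), scale(min(ys), height),
            scale(max(xs), width), scale(max(ys), height)]


def ocr_lines_to_words(lines, width: int, height: int) -> Tuple[List[str], List[List[int]]]:
    """Flatten OCR lines into parallel lists of words and normalized boxes."""
    words: List[str] = []
    boxes: List[List[int]] = []
    for polygon, (text, _confidence) in lines:
        box = polygon_to_box(polygon, width, height)
        for word in text.split():
            words.append(word)
            boxes.append(box)  # every word inherits its line's box
    return words, boxes
```

The resulting `words` and `boxes` lists have the same length, which is what a LayoutLMv3 processor expects when its built-in OCR is turned off.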

Training Metrics

  • F1 score (validation): plotted in training_history.png
  • Loss and learning rate: plotted in training_history.png

Evaluation

Metrics Used

  • F1 score
  • Match score of predicted spans
  • Token overlap vs ground truth
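
The card does not give the exact metric definitions. As an illustration of how token overlap and F1 are commonly computed for extractive QA, here is a SQuAD-style token-level F1; the function name and the lowercase, whitespace-split normalization are assumptions.

```python
from collections import Counter


def token_f1(prediction: str, truth: str) -> float:
    """Token-level F1 between a predicted answer and the ground-truth answer."""
    pred_tokens = prediction.lower().split()
    true_tokens = truth.lower().split()

    # If either side is empty, score 1.0 only when both are empty.
    if not pred_tokens or not true_tokens:
        return float(pred_tokens == true_tokens)

    # Multiset intersection counts each shared token at most as often as it occurs in both.
    overlap = sum((Counter(pred_tokens) & Counter(true_tokens)).values())
    if overlap == 0:
        return 0.0

    precision = overlap / len(pred_tokens)
    recall = overlap / len(true_tokens)
    return 2 * precision * recall / (precision + recall)
```

Under this definition a prediction that includes one extra word around the correct answer is penalized through precision but still gets partial credit.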

Summary

The model performs well on document-style QA tasks, especially with:

  • Clearly structured OCR results
  • Document types similar to utility bills, invoices, and forms

How to Use
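
A minimal inference sketch with the transformers library is shown here. It assumes the checkpoint is published as lakshya-rawat/document-qa-model with an extractive question-answering head, that the processor can be loaded from the base model, and that words and boxes were already produced by PaddleOCR with boxes on the 0-1000 scale. The helper names `pick_span` and `answer_question` are illustrative and not part of the released model.

```python
from typing import List, Sequence, Tuple


def pick_span(start_scores: Sequence[float], end_scores: Sequence[float],
              max_len: int = 30) -> Tuple[int, int]:
    """Return the (start, end) token pair with the highest combined score, start <= end."""
    best, best_score = (0, 0), float("-inf")
    for start, start_score in enumerate(start_scores):
        for end in range(start, min(start + max_len, len(end_scores))):
            score = start_score + end_scores[end]
            if score > best_score:
                best, best_score = (start, end), score
    return best


def answer_question(image, question: str, words: List[str], boxes: List[List[int]],
                    model_id: str = "lakshya-rawat/document-qa-model") -> str:
    """Answer one question about one page. `image` is an RGB PIL image."""
    # Heavy imports stay local so pick_span can be used without them.
    import torch
    from transformers import LayoutLMv3ForQuestionAnswering, LayoutLMv3Processor

    # apply_ocr=False: supply our own OCR words and boxes instead of the built-in OCR.
    # Assumption: the base model's processor config matches the fine-tuned checkpoint.
    processor = LayoutLMv3Processor.from_pretrained(
        "microsoft/layoutlmv3-base", apply_ocr=False)
    model = LayoutLMv3ForQuestionAnswering.from_pretrained(model_id)
    model.eval()

    # Pages longer than the model's 512-token limit need truncation or windowing (not shown).
    encoding = processor(image, question, words, boxes=boxes, return_tensors="pt")
    with torch.no_grad():
        outputs = model(**encoding)

    start, end = pick_span(outputs.start_logits[0].tolist(),
                           outputs.end_logits[0].tolist())
    answer_ids = encoding["input_ids"][0, start:end + 1]
    return processor.tokenizer.decode(answer_ids, skip_special_tokens=True).strip()
```

The span search is kept separate from the model call so it can be checked on plain lists. This simple version does not restrict the span to the document tokens, so a stricter implementation would mask out the question part of the input.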
