🧾 Qwen2 7B Vision Invoice Extraction

This model is fine-tuned on invoice data to extract structured information from invoice images. It uses the Unsloth framework for fast and memory-efficient training.


📦 How to Use (Inference)

Make sure to install the necessary dependencies first:

pip install unsloth torch torchvision pillow

Then load the model and run inference on an invoice image:

from unsloth import FastVisionModel
from PIL import Image
import re
import json
import torch 

device = "cuda" if torch.cuda.is_available() else "cpu"

model_nf, tokenizer_nf = FastVisionModel.from_pretrained(
    model_name="ashishscapsitech123/qwen2_7b_invoice_extraction",
    load_in_4bit=True,
    device_map={"": device}, 
)

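# device_map above already places the model on the target device. Calling
# .to() on a 4-bit quantized model is redundant and can raise an error in
# some transformers/bitsandbytes versions, so it is omitted here.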
FastVisionModel.for_inference(model_nf)

# Load the invoice image
image = [Image.open("testing_image/1.png")]

# Define the structured prompt
instruction = """You are an expert invoice parser. Extract and return only the following JSON structure from the invoice image (Do not include any fields value in the Product key if that key and value particularly not present in that Table. Analyze the Description carefully):

IMPORTANT RULES:
1. Return the JSON with ALL fields in the same order, even if some values are missing and do not add any extra field.
2. If a value is not found in the image, use `null` (not an empty string).
3. DO NOT skip or rename any field from the given JSON structure.
4. Strictly maintain the JSON structure and follow the exact keys to extract under the `products` key.

JSON Structure:
{
  "supplierName": "",
  "supplierAddress": "",
  "mobileNumber": null,
  "email": null,
  "website": null,
  "vatNumber": null,
  "accountName": null,
  "sortCode": "",
  "accountNumber": "",
  "invoiceNumber": "",
  "poReference": null,
  "date": "",
  "dueDate": "",
  "products": [
    {
      "description": "",
      "quantity": "",
      "discountAmount": "",
      "discount%": "",
      "vatAmount": "",
      "vat%": "",
      "unitPrice": "",
      "net": ""
    }
  ],
  "freightTotal": null,
  "totalVat": {
    "vat%": "",
    "vatAmount": ""
  },
  "totalDiscount": {
    "discount%": null,
    "discountAmount": null
  },
  "totalAmount": "",
  "netAmount": ""
}
"""

# Format messages for vision chat model
messages = [
    {"role": "user", "content": [
        {"type": "image", "image": image[0]},
        {"type": "text", "text": instruction}
    ]}
]

input_text = tokenizer_nf.apply_chat_template(messages, add_generation_prompt=True)

inputs = tokenizer_nf(
    [image[0].resize((640, 640))],
    input_text,
    add_special_tokens=False,
    return_tensors="pt"
).to(device)

output_tokens = model_nf.generate(
    **inputs,
    max_new_tokens=2048,
    use_cache=True,
    temperature=0.1,
    min_p=0.1
)

output_text = tokenizer_nf.decode(output_tokens[0], skip_special_tokens=True)

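# Extract the JSON object that follows the "assistant" turn marker
match = re.search(r"assistant\s*(\{.*\})", output_text, re.DOTALL)
if match:
    json_str = match.group(1)
    try:
        # Parse the output as-is first so that valid JSON is left untouched
        data = json.loads(json_str)
    except json.JSONDecodeError:
        # Repair common slips only when strict parsing fails: single-quoted
        # strings and bare nan tokens. These replacements can also alter
        # apostrophes or the word "nan" inside values, so check the result.
        repaired = json_str.replace("'", '"')
        repaired = re.sub(r'\bnan\b', 'null', repaired, flags=re.IGNORECASE)
        try:
            data = json.loads(repaired)
        except json.JSONDecodeError as e:
            data = None
            print("Error parsing JSON:", e)
    if data is not None:
        print("Extracted JSON:")
        print(data)
else:
    print("No JSON found in model output.")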
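
The prompt asks the model to return every field of the JSON structure and to add no extra fields, so it can be useful to check the parsed result before using it downstream. The following is a minimal sketch of such a check, not part of the model's repository: the helper name `check_invoice_keys` and the key lists are assumptions copied from the JSON structure in the prompt above. Because the prompt allows product fields that are absent from the invoice table to be left out, the sketch only flags unexpected keys inside `products`, not missing ones.

```python
# Hypothetical helper for illustration; key names are copied from the prompt.
EXPECTED_TOP_LEVEL = [
    "supplierName", "supplierAddress", "mobileNumber", "email", "website",
    "vatNumber", "accountName", "sortCode", "accountNumber", "invoiceNumber",
    "poReference", "date", "dueDate", "products", "freightTotal",
    "totalVat", "totalDiscount", "totalAmount", "netAmount",
]
EXPECTED_PRODUCT = [
    "description", "quantity", "discountAmount", "discount%",
    "vatAmount", "vat%", "unitPrice", "net",
]
EXPECTED_NESTED = {
    "totalVat": ["vat%", "vatAmount"],
    "totalDiscount": ["discount%", "discountAmount"],
}


def check_invoice_keys(data):
    """Return (missing, unexpected) key paths for a parsed invoice dict."""
    # Top-level keys that the prompt requires but the output lacks
    missing = [k for k in EXPECTED_TOP_LEVEL if k not in data]
    # Top-level keys that the prompt never asked for
    unexpected = [k for k in data if k not in EXPECTED_TOP_LEVEL]

    # Nested objects must carry both of their keys when present as dicts
    for name, keys in EXPECTED_NESTED.items():
        nested = data.get(name)
        if isinstance(nested, dict):
            missing += [f"{name}.{k}" for k in keys if k not in nested]
            unexpected += [f"{name}.{k}" for k in nested if k not in keys]

    # Product rows may omit fields, so only unknown keys are reported
    products = data.get("products")
    if isinstance(products, list):
        for i, row in enumerate(products):
            if isinstance(row, dict):
                unexpected += [
                    f"products[{i}].{k}" for k in row if k not in EXPECTED_PRODUCT
                ]
    return missing, unexpected
```

After a successful parse, calling `missing, unexpected = check_invoice_keys(data)` and logging both lists gives a quick signal of whether the model followed the requested structure for a given invoice.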