Request: DOI

#122

by HarshDoshi21 - opened 23 days ago

We are requesting access to Llama-3.2-11B-Vision-Instruct for evaluation and development purposes.
Our intended use is to build an AI-powered form/document understanding system that can accurately extract structured data from semi-structured scanned documents (such as certificates, government forms, and business records).
Our focus is production and business use. The model will be used in the following contexts:

Learning & Research: To explore multimodal LLM capabilities for text–image understanding.

Business/Production Use: As part of a document intelligence pipeline to automate manual data entry, improve efficiency, and ensure accuracy in enterprise workflows.

Non-Malicious Use: The model will be applied strictly to OCR and structured data extraction, with no generation of harmful, biased, or inappropriate content.

This aligns with Hugging Face’s mission of safe, beneficial AI adoption. Access will allow us to test advanced vision-language reasoning for real-world document processing use cases.
