Request: DOI

#122
by HarshDoshi21 - opened

We are requesting access to Llama-3.2-11B-Vision-Instruct for evaluation and development purposes.
Our intended use is to build an AI-powered form/document understanding system that can accurately extract structured data from semi-structured scanned documents (such as certificates, government forms, and business records).
The focus is production and business use. The model will be used in the following contexts:

Learning & Research: To explore multimodal LLM capabilities for text–image understanding.

Business/Production Use: As part of a document intelligence pipeline to automate manual data entry, improve efficiency, and ensure accuracy in enterprise workflows.

Non-Malicious Use: We will not generate harmful, biased, or inappropriate content. The model will be applied strictly to OCR and structured data extraction.

This aligns with Hugging Face’s mission of safe, beneficial AI adoption. Access will allow us to test advanced vision-language reasoning for real-world document processing use cases.
