Hindi Vision-Language Model
This is a multimodal vision-language model (VLM) for Hindi. It combines:
- A custom Hindi language model
- A vision encoder (ViT)
The combined model is trained on Hindi VQA (visual question answering) data.
Usage
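# Assumption: these are local modules (.py files) available on the import path.
from fixed_vlm_model import FixedHindiVLM
from hindi_language_model import HindiCausalLM, HindiCausalLMConfig
from hindi_embeddings import SentencePieceTokenizerWrapper

# Load the pretrained model from the Hugging Face Hub repository.
model = FixedHindiVLM.from_pretrained("convaiinnovations/hindi-vlm-model")
# Load the SentencePiece tokenizer from a local tokenizer.model file.
tokenizer = SentencePieceTokenizerWrapper("tokenizer.model")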
Model Details
- Vision Encoder: ViT (Vision Transformer)
- Language Model: Custom Hindi Causal LM
- Training Data: Hindi VQA dataset (damerajee/clean_hin_vqa)
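This card does not say which ViT variant is used or how image features reach the language model. A common design, assumed here purely for illustration, is to prepend the ViT output embeddings to the text token embeddings. Under that assumption, the sketch below estimates how many positions one image occupies in the language model's input. The function names and the defaults (224-pixel images, 16-pixel patches, one class token) are hypothetical and are not taken from this model's configuration.

```python
def vit_sequence_length(image_size: int = 224,
                        patch_size: int = 16,
                        use_cls_token: bool = True) -> int:
    """Number of embeddings a standard ViT emits for one square image.

    Defaults are assumptions (ViT-Base style), not values from this model.
    """
    if image_size % patch_size != 0:
        raise ValueError("image_size must be divisible by patch_size")
    patches_per_side = image_size // patch_size
    num_patches = patches_per_side * patches_per_side
    # A class token, when present, adds one extra embedding.
    return num_patches + (1 if use_cls_token else 0)


def combined_sequence_length(num_text_tokens: int, **vit_kwargs) -> int:
    """Input length when image embeddings are prepended to the text tokens."""
    return vit_sequence_length(**vit_kwargs) + num_text_tokens


if __name__ == "__main__":
    print(vit_sequence_length())          # 14 * 14 patches + 1 class token
    print(combined_sequence_length(12))   # image positions + 12 text tokens
```

If the actual model pools or resamples the image features before passing them to the language model, the image would take fewer positions than this estimate.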