
Hindi Vision-Language Model

This is a multimodal vision-language model (VLM) for Hindi. It combines:

  1. A custom Hindi causal language model
  2. A Vision Transformer (ViT) image encoder
  3. Training on Hindi visual question answering (VQA) data

Usage

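# These modules are assumed to be local files shipped with the model,
# so they must be importable from the working directory.
from fixed_vlm_model import FixedHindiVLM
from hindi_language_model import HindiCausalLM, HindiCausalLMConfig
from hindi_embeddings import SentencePieceTokenizerWrapper

# Load the model weights from the Hugging Face Hub.
model = FixedHindiVLM.from_pretrained("convaiinnovations/hindi-vlm-model")
# Load the SentencePiece tokenizer from a local tokenizer.model file.
tokenizer = SentencePieceTokenizerWrapper("tokenizer.model")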
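
The usage snippet imports three custom modules and reads `tokenizer.model` from the current directory, so it can help to confirm those files are in place before loading the model. The following is a minimal sketch of such a check. The file names are assumptions inferred from the import statements (each module is taken to be a single `.py` file), and `missing_files` is a hypothetical helper, not part of this repository.

```python
from pathlib import Path

# Assumed file names, inferred from the imports and the tokenizer path in the
# usage snippet; the actual repository layout is not documented here.
REQUIRED_FILES = (
    "fixed_vlm_model.py",
    "hindi_language_model.py",
    "hindi_embeddings.py",
    "tokenizer.model",
)


def missing_files(directory="."):
    """Return the assumed required files that are absent from `directory`."""
    base = Path(directory)
    return [name for name in REQUIRED_FILES if not (base / name).is_file()]


if __name__ == "__main__":
    missing = missing_files()
    if missing:
        print("Missing files:", ", ".join(missing))
    else:
        print("All expected files are present.")
```

If the check reports missing files, the imports in the usage snippet will likely fail with `ModuleNotFoundError` until those files are copied next to your script.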

Model Details

  • Vision Encoder: ViT (Vision Transformer)
  • Language Model: Custom Hindi Causal LM
  • Training Data: Hindi VQA dataset (damerajee/clean_hin_vqa)
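
For readers unfamiliar with ViT encoders, the sketch below shows how many visual tokens such an encoder produces for one image. The card does not state which ViT variant is used, so the 224-pixel input and 16-pixel patch size are assumptions borrowed from the common ViT-Base/16 configuration, and `vit_sequence_length` is a hypothetical helper for illustration only.

```python
def vit_sequence_length(image_size, patch_size, use_cls_token=True):
    """Number of tokens a ViT produces for a square image.

    The image is cut into non-overlapping square patches; each patch becomes
    one token, optionally preceded by a single [CLS] token.
    """
    if image_size % patch_size != 0:
        raise ValueError("image_size must be divisible by patch_size")
    patches_per_side = image_size // patch_size
    num_patches = patches_per_side ** 2
    return num_patches + (1 if use_cls_token else 0)


# Assumed ViT-Base/16 settings: 224x224 input, 16x16 patches.
print(vit_sequence_length(224, 16))  # 14 * 14 patches + 1 [CLS] token = 197
```

Under these assumed settings, the language model would receive on the order of a couple of hundred visual tokens per image, which is worth keeping in mind when budgeting context length for the Hindi question and answer.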