
Hindi Embedding Foundational Model

This is a multilingual causal language model with a focus on Hindi text generation. The model uses a custom architecture with several advanced features:

  • Mixture of Experts (MoE) for more efficient and scalable parameter usage
  • Rotary Position Embeddings (RoPE) for improved handling of positional information (a brief sketch follows this list)
  • Grouped Query Attention (GQA) for efficient attention computation
  • Language embeddings for multilingual support
  • Initial CNN layer for improved token representation
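
To make the RoPE idea concrete: each pair of query/key channels is rotated by a position-dependent angle, so attention scores depend only on relative token offsets. Below is a minimal, illustrative sketch of the common interleaved formulation; the function name and tensor shapes are assumptions and do not necessarily match this model's implementation.

import torch

def apply_rope(x: torch.Tensor, base: float = 10000.0) -> torch.Tensor:
    """Rotate channel pairs of x (batch, seq, heads, head_dim) by position-dependent angles."""
    _, seq_len, _, head_dim = x.shape
    # One rotation frequency per channel pair; earlier pairs rotate faster
    inv_freq = 1.0 / (base ** (torch.arange(0, head_dim, 2).float() / head_dim))
    angles = torch.outer(torch.arange(seq_len).float(), inv_freq)  # (seq, head_dim/2)
    cos = angles.cos()[None, :, None, :]  # broadcast over batch and heads
    sin = angles.sin()[None, :, None, :]
    x1, x2 = x[..., 0::2], x[..., 1::2]   # interleaved channel pairs
    # Rotate each (x1, x2) pair by its position-dependent angle
    rotated = torch.stack((x1 * cos - x2 * sin, x1 * sin + x2 * cos), dim=-1)
    return rotated.flatten(-2)  # restore (batch, seq, heads, head_dim)

Because the same rotation is applied to queries and keys, their dot product becomes a function of the position difference, which is what gives RoPE its relative-position behavior.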

Model Details

  • Type: Causal Language Model (auto-regressive)
  • Framework: PyTorch (custom architecture)
  • Language Support: Primary focus on Hindi
  • License: Apache 2.0
  • Developed by: ConvaiInnovations

Usage

This model uses a custom architecture, so it cannot be loaded through the standard transformers auto classes. To run inference, include the following Python modules in your project:

  • convaicausallm_model_with_moe_rope.py: Contains the model architecture
  • hindi_embeddings.py: Contains the SentencePiece tokenizer wrapper

Sample Code

import json

import torch
from safetensors.torch import load_file

from convaicausallm_model_with_moe_rope import ConvaiCausalLMConfig, ConvaiCausalLM
from hindi_embeddings import SentencePieceTokenizerWrapper

# Load the tokenizer and model configuration
tokenizer = SentencePieceTokenizerWrapper("tokenizer.model")

with open("config.json", "r") as f:
    config_dict = json.load(f)

# Build the model and load the pretrained weights
config = ConvaiCausalLMConfig(**config_dict)
model = ConvaiCausalLM(config)
state_dict = load_file("model.safetensors")
model.load_state_dict(state_dict)
model.eval()  # disable dropout for inference

# Encode the prompt ("What is the capital of India?")
input_text = "भारत की राजधानी क्या है?"
input_ids = tokenizer.sp_model.EncodeAsIds(input_text)
input_ids_tensor = torch.tensor([input_ids], dtype=torch.long)
lang_id = torch.tensor([0], dtype=torch.long)  # Language ID for Hindi

# Forward pass: greedily pick the most likely next token
with torch.no_grad():
    outputs = model(input_ids=input_ids_tensor, lang_ids=lang_id, char_ids=None)
next_token_logits = outputs["logits"][:, -1, :]
next_token = torch.argmax(next_token_logits, dim=-1).unsqueeze(-1)

# Continue generation as needed...

See generate_multilingual.py for a complete text generation implementation.
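
In the meantime, a minimal greedy decoding loop can be built directly on the forward pass above. This is an illustrative sketch, not the contents of generate_multilingual.py; it assumes the wrapped SentencePiece model exposes the standard DecodeIds and eos_id methods.

# Greedy decoding sketch (illustrative; continues from the variables above)
max_new_tokens = 50
generated = input_ids_tensor

with torch.no_grad():
    for _ in range(max_new_tokens):
        outputs = model(input_ids=generated, lang_ids=lang_id, char_ids=None)
        next_token = torch.argmax(outputs["logits"][:, -1, :], dim=-1, keepdim=True)
        generated = torch.cat([generated, next_token], dim=-1)
        if next_token.item() == tokenizer.sp_model.eos_id():  # stop at end-of-sequence
            break

print(tokenizer.sp_model.DecodeIds(generated[0].tolist()))

Swapping the argmax for temperature or top-k sampling is the usual next step for more varied output.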

Limitations

This is an early version of the model with the following limitations:

  • Limited contextual knowledge
  • May generate inaccurate or nonsensical information
  • Performance varies depending on input prompt and generation parameters

Acknowledgments

This work builds upon advancements in language model architecture and training techniques from the research community.
