
Hindi Embedding Foundational Model

This is a multilingual causal language model with a focus on Hindi text generation. The model uses a custom architecture with several advanced features:

  • Mixture of Experts (MoE) for more efficient and scalable parameter usage
  • Rotary Position Embeddings (RoPE) for improved handling of positional information (a brief sketch follows this list)
  • Grouped Query Attention (GQA) for efficient attention computation
  • Language embeddings for multilingual support
  • Initial CNN layer for improved token representation
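
To make the RoPE idea concrete: each pair of query/key channels is rotated by a position-dependent angle, so attention scores depend only on relative token offsets. Below is a minimal, illustrative sketch of the common interleaved formulation; the function name and tensor shapes are assumptions and do not necessarily match this model's implementation.

import torch

def apply_rope(x: torch.Tensor, base: float = 10000.0) -> torch.Tensor:
    """Rotate channel pairs of x (batch, seq, heads, head_dim) by position-dependent angles."""
    _, seq_len, _, head_dim = x.shape
    # One rotation frequency per channel pair; earlier pairs rotate faster
    inv_freq = 1.0 / (base ** (torch.arange(0, head_dim, 2).float() / head_dim))
    angles = torch.outer(torch.arange(seq_len).float(), inv_freq)  # (seq, head_dim/2)
    cos = angles.cos()[None, :, None, :]  # broadcast over batch and heads
    sin = angles.sin()[None, :, None, :]
    x1, x2 = x[..., 0::2], x[..., 1::2]   # interleaved channel pairs
    # Rotate each (x1, x2) pair by its position-dependent angle
    rotated = torch.stack((x1 * cos - x2 * sin, x1 * sin + x2 * cos), dim=-1)
    return rotated.flatten(-2)  # restore (batch, seq, heads, head_dim)

Because the same rotation is applied to queries and keys, their dot product becomes a function of the position difference, which is what gives RoPE its relative-position behavior.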

Model Details

  • Type: Causal Language Model (auto-regressive)
  • Framework: PyTorch (custom architecture)
  • Language Support: Primary focus on Hindi
  • License: Apache 2.0
  • Developed by: ConvaiInnovations

Usage

This model uses a custom architecture, so it cannot be loaded through the standard transformers auto classes. To run inference, include the following Python modules in your project:

  • convaicausallm_model_with_moe_rope.py: Contains the model architecture
  • hindi_embeddings.py: Contains the SentencePiece tokenizer wrapper

Sample Code

import json

import torch
from safetensors.torch import load_file

from convaicausallm_model_with_moe_rope import ConvaiCausalLMConfig, ConvaiCausalLM
from hindi_embeddings import SentencePieceTokenizerWrapper

# Load the tokenizer and model configuration
tokenizer = SentencePieceTokenizerWrapper("tokenizer.model")

with open("config.json", "r") as f:
    config_dict = json.load(f)

# Build the model and load the pretrained weights
config = ConvaiCausalLMConfig(**config_dict)
model = ConvaiCausalLM(config)
state_dict = load_file("model.safetensors")
model.load_state_dict(state_dict)
model.eval()  # disable dropout for inference

# Encode the prompt ("What is the capital of India?")
input_text = "भारत की राजधानी क्या है?"
input_ids = tokenizer.sp_model.EncodeAsIds(input_text)
input_ids_tensor = torch.tensor([input_ids], dtype=torch.long)
lang_id = torch.tensor([0], dtype=torch.long)  # Language ID for Hindi

# Forward pass: greedily pick the most likely next token
with torch.no_grad():
    outputs = model(input_ids=input_ids_tensor, lang_ids=lang_id, char_ids=None)
next_token_logits = outputs["logits"][:, -1, :]
next_token = torch.argmax(next_token_logits, dim=-1).unsqueeze(-1)

# Continue generation as needed...

See generate_multilingual.py for a complete text generation implementation.
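
In the meantime, a minimal greedy decoding loop can be built directly on the forward pass above. This is an illustrative sketch, not the contents of generate_multilingual.py; it assumes the wrapped SentencePiece model exposes the standard DecodeIds and eos_id methods.

# Greedy decoding sketch (illustrative; continues from the variables above)
max_new_tokens = 50
generated = input_ids_tensor

with torch.no_grad():
    for _ in range(max_new_tokens):
        outputs = model(input_ids=generated, lang_ids=lang_id, char_ids=None)
        next_token = torch.argmax(outputs["logits"][:, -1, :], dim=-1, keepdim=True)
        generated = torch.cat([generated, next_token], dim=-1)
        if next_token.item() == tokenizer.sp_model.eos_id():  # stop at end-of-sequence
            break

print(tokenizer.sp_model.DecodeIds(generated[0].tolist()))

Swapping the argmax for temperature or top-k sampling is the usual next step for more varied output.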

Limitations

This is an early version of the model with the following limitations:

  • Limited contextual knowledge
  • May generate inaccurate or nonsensical information
  • Performance varies depending on input prompt and generation parameters

Acknowledgments

This work builds upon advancements in language model architecture and training techniques from the research community.
