Mixture-of-Experts Hindi Language Model (Base-1)
This is a causal Hindi language model built on a Mixture-of-Experts (MoE) architecture and trained on a Hindi text corpus.
Model Details
- Model Type: Causal Language Model with Mixture-of-Experts
- Language: Hindi
- License: MIT
- Training Data: Hindi corpus
- Model Size: ~1.8GB
- Training Steps: 7000
- Format: SafeTensors
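The card gives the file size but not the parameter count or the stored dtype. As a minimal sketch, both can be read from the SafeTensors header without loading any weights. The helper name `summarize_safetensors` is hypothetical and is not part of this repository; it relies only on the published SafeTensors layout (an 8-byte little-endian header length followed by a JSON header).

```python
import json
import struct


def summarize_safetensors(path):
    """Return (total_parameters, {dtype: parameter_count}) from a SafeTensors header."""
    with open(path, "rb") as f:
        # The file starts with an 8-byte little-endian unsigned header length,
        # followed by that many bytes of JSON describing each tensor.
        (header_len,) = struct.unpack("<Q", f.read(8))
        header = json.loads(f.read(header_len))

    per_dtype = {}
    for name, info in header.items():
        if name == "__metadata__":  # optional free-form metadata, not a tensor
            continue
        count = 1
        for dim in info["shape"]:
            count *= dim
        per_dtype[info["dtype"]] = per_dtype.get(info["dtype"], 0) + count
    return sum(per_dtype.values()), per_dtype


# Example (assumes model.safetensors is in the current directory):
# total, per_dtype = summarize_safetensors("model.safetensors")
# print(total, per_dtype)
```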
Usage
This model uses a custom implementation and is not directly compatible with Hugging Face's AutoModelForCausalLM and AutoTokenizer. The simplest way to use it is with the provided generate.py script:
# Install safetensors if you don't have it
pip install safetensors
# Run the script with your Hindi text prompt
python generate.py "आज का दिन बहुत अच्छा है"
For more advanced usage, you can integrate the model into your own code:
# Import the required modules
import os
from hindi_embeddings import SentencePieceTokenizerWrapper
from convaicausallm_model_with_moe_rope import ConvaiCausalLMConfig, ConvaiCausalLM
import torch
import json
from safetensors.torch import load_file
# Load the config
with open("config.json", "r") as f:
    config_dict = json.load(f)
config = ConvaiCausalLMConfig(**config_dict)
# Load the tokenizer
tokenizer = SentencePieceTokenizerWrapper("tokenizer.model")
# Load the model (supports both SafeTensors and PyTorch formats)
model = ConvaiCausalLM(config)
# Check which format is available and load accordingly
if os.path.exists("model.safetensors"):
    model.load_state_dict(load_file("model.safetensors"))
else:
    model.load_state_dict(torch.load("pytorch_model.bin", map_location="cpu"))
model.eval()
# Generate text
prompt = "आज का दिन बहुत अच्छा है"
input_ids = tokenizer.sp_model.EncodeAsIds(prompt)
input_tensor = torch.tensor([input_ids], dtype=torch.long)
with torch.no_grad():
    output_ids = model.generate(
        input_ids=input_tensor,
        max_length=50,
        temperature=0.7,
        top_k=10,
        top_p=0.9,
        do_sample=True
    )
# Print the generated text
generated_text = tokenizer.sp_model.DecodeIds(output_ids[0].tolist())
print(generated_text)
Generation Parameters
For best results with this model, we recommend:
- temperature: 0.2-0.7 (lower is more conservative)
- top_k: 5-10 (lower gives more predictable outputs)
- top_p: 0.85-0.92
- max_length: 50-100 tokens
- repetition_penalty: 1.1-1.2 (helps prevent repetition)
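To show what these parameters do to the next-token distribution, here is a minimal pure-Python sketch of one common ordering: repetition penalty, then temperature, then top-k, then top-p. It is an illustration only. The function names are hypothetical, and the model's own `generate` method may apply these steps differently.

```python
import math
import random


def apply_repetition_penalty(logits, previous_ids, penalty):
    """Make already-generated token ids less likely (CTRL-style penalty)."""
    out = list(logits)
    for i in set(previous_ids):
        out[i] = out[i] / penalty if out[i] > 0 else out[i] * penalty
    return out


def filtered_probs(logits, temperature=0.7, top_k=10, top_p=0.9):
    """Return one probability per token id after temperature, top-k and top-p."""
    scaled = [x / temperature for x in logits]
    # Top-k: keep only the k highest-scoring token ids.
    order = sorted(range(len(scaled)), key=lambda i: scaled[i], reverse=True)[:top_k]
    # Softmax over the survivors (subtract the max for numerical stability).
    peak = scaled[order[0]]
    weights = [math.exp(scaled[i] - peak) for i in order]
    total = sum(weights)
    probs = [w / total for w in weights]
    # Top-p: keep the smallest prefix whose cumulative probability reaches top_p.
    kept, cumulative = [], 0.0
    for i, p in zip(order, probs):
        kept.append((i, p))
        cumulative += p
        if cumulative >= top_p:
            break
    # Renormalize so the kept probabilities sum to 1; all other ids get 0.
    norm = sum(p for _, p in kept)
    result = [0.0] * len(logits)
    for i, p in kept:
        result[i] = p / norm
    return result


def sample(logits, previous_ids, repetition_penalty=1.1, rng=random, **kwargs):
    """Draw one token id from the penalized and filtered distribution."""
    penalized = apply_repetition_penalty(logits, previous_ids, repetition_penalty)
    probs = filtered_probs(penalized, **kwargs)
    return rng.choices(range(len(probs)), weights=probs, k=1)[0]
```

Lower temperature sharpens the distribution before filtering, while smaller top_k and top_p shrink the set of candidate tokens, which is why the lower ends of the recommended ranges give more predictable output.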
Model Architecture
The model features:
- Mixture-of-Experts (MoE) layers
- RoPE (Rotary Position Embeddings)
- Grouped Query Attention (GQA)
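The three components above can be sketched in a few lines of pure Python. This is a conceptual illustration under stated assumptions, not the code in convaicausallm_model_with_moe_rope.py: the function names, the choice of k=2 experts, the RoPE base of 10000, and the interleaved pairing of dimensions are all assumptions, and the actual values live in config.json and the model implementation.

```python
import math


def route_top_k(router_logits, k=2):
    """MoE routing: pick k experts for one token and normalize their gate weights."""
    chosen = sorted(range(len(router_logits)),
                    key=lambda e: router_logits[e], reverse=True)[:k]
    peak = router_logits[chosen[0]]
    weights = [math.exp(router_logits[e] - peak) for e in chosen]
    total = sum(weights)
    return [(e, w / total) for e, w in zip(chosen, weights)]


def moe_forward(x, experts, router_logits, k=2):
    """Weighted sum of the outputs of the selected experts only."""
    out = [0.0] * len(x)
    for e, gate in route_top_k(router_logits, k):
        y = experts[e](x)
        out = [o + gate * yi for o, yi in zip(out, y)]
    return out


def rope(vec, position, base=10000.0):
    """RoPE: rotate each (even, odd) pair of a query/key vector by a position-dependent angle."""
    out = list(vec)
    dim = len(vec)  # assumed to be even
    for i in range(0, dim, 2):
        theta = position * base ** (-i / dim)
        c, s = math.cos(theta), math.sin(theta)
        out[i] = vec[i] * c - vec[i + 1] * s
        out[i + 1] = vec[i] * s + vec[i + 1] * c
    return out


def kv_head_for(query_head, num_query_heads, num_kv_heads):
    """GQA: consecutive query heads share one key/value head."""
    group_size = num_query_heads // num_kv_heads
    return query_head // group_size
```

In this sketch only the selected experts run for a given token, which is what keeps MoE compute per token well below the total parameter count, and GQA reduces the key/value cache by letting several query heads read the same key/value head.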
Limitations
This is an early experimental version of the model, trained for only 7000 steps. Expect limited quality: the output may not yet be fluent or coherent Hindi.
Files Included
- model.safetensors: The model weights in SafeTensors format
- config.json: Model configuration
- tokenizer.model: SentencePiece tokenizer model
- hindi_embeddings.py: Tokenizer implementation
- convaicausallm_model_with_moe_rope.py: Model implementation
- generate.py: Example inference script
Training
Training was performed using a custom PyTorch implementation with mixed precision and data parallelism.
Citation
If you use this model, please cite:
@misc{convaiinnovations2025hindi,
author = {ConvAI Innovations},
title = {Mixture-of-Experts Hindi Language Model},
year = {2025},
publisher = {HuggingFace},
howpublished = {https://huggingface.co/convaiinnovations/moe-hindi-fm-base-1}
}