# Hindi Embedding Foundational Model
This is a multilingual causal language model with a focus on Hindi text generation. The model uses a custom architecture with several advanced features:
- Mixture of Experts (MoE) for more efficient and scalable parameter usage
- Rotary Position Embeddings (RoPE) for improved handling of positional information (see the sketch after this list)
- Grouped Query Attention (GQA) for efficient attention computation
- Language embeddings for multilingual support
- Initial CNN layer for improved token representation
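The model's own RoPE implementation lives in `convaicausallm_model_with_moe_rope.py`. As a rough illustration of the technique, the sketch below applies the standard rotary transformation to a query or key tensor; the function name and the `base=10000.0` default are generic assumptions, not code from this repository:

```python
import torch

def rope_rotate(x: torch.Tensor, base: float = 10000.0) -> torch.Tensor:
    """Apply standard rotary position embeddings to x of shape
    (batch, seq_len, num_heads, head_dim). Illustrative only; this
    model's actual RoPE code is in convaicausallm_model_with_moe_rope.py."""
    _, seq_len, _, head_dim = x.shape
    half = head_dim // 2

    # Per-channel-pair rotation frequencies and per-position angles
    inv_freq = 1.0 / (base ** (torch.arange(0, half, dtype=torch.float32) / half))
    angles = torch.arange(seq_len, dtype=torch.float32)[:, None] * inv_freq[None, :]
    cos = angles.cos()[None, :, None, :]  # broadcast over batch and heads
    sin = angles.sin()[None, :, None, :]

    # Rotate paired channels (first half with second half) by their angles,
    # encoding each token's absolute position as a rotation in 2D subspaces
    x1, x2 = x[..., :half], x[..., half:]
    return torch.cat([x1 * cos - x2 * sin, x1 * sin + x2 * cos], dim=-1)
```

Because the rotation angle depends only on position, the dot product between rotated queries and keys depends only on their relative offset, which is what makes RoPE attractive for attention.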
## Model Details
- Type: Causal Language Model (auto-regressive)
- Framework: PyTorch (custom architecture)
- Language Support: Primary focus on Hindi
- License: Apache 2.0
- Developed by: ConvaiInnovations
## Usage
This model requires custom architecture files for inference. You need to include the following Python modules in your project:
- `convaicausallm_model_with_moe_rope.py`: contains the model architecture
- `hindi_embeddings.py`: contains the SentencePiece tokenizer wrapper
### Sample Code
```python
import torch
import json
from safetensors.torch import load_file

from convaicausallm_model_with_moe_rope import ConvaiCausalLMConfig, ConvaiCausalLM
from hindi_embeddings import SentencePieceTokenizerWrapper

# Load the tokenizer and model configuration
tokenizer = SentencePieceTokenizerWrapper("tokenizer.model")
with open("config.json", "r") as f:
    config_dict = json.load(f)
config = ConvaiCausalLMConfig(**config_dict)

# Build the model and load the safetensors weights
model = ConvaiCausalLM(config)
state_dict = load_file("model.safetensors")
model.load_state_dict(state_dict)
model.eval()

# Encode a Hindi prompt ("What is the capital of India?")
input_text = "भारत की राजधानी क्या है?"
input_ids = tokenizer.sp_model.EncodeAsIds(input_text)
input_ids_tensor = torch.tensor([input_ids], dtype=torch.long)
lang_id = torch.tensor([0], dtype=torch.long)  # Language ID for Hindi

# Forward pass: greedily pick the most likely next token
with torch.no_grad():
    outputs = model(input_ids=input_ids_tensor, lang_ids=lang_id, char_ids=None)
next_token_logits = outputs["logits"][:, -1, :]
next_token = torch.argmax(next_token_logits, dim=-1).unsqueeze(-1)
# Continue generation as needed...
```
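To extend the single-step example into multi-token generation, a minimal greedy loop along the following lines should work. It assumes the model accepts the same arguments on every step and that the wrapped SentencePiece processor provides `DecodeIds`; the `EOS_ID` value is a placeholder to be replaced with the tokenizer's real end-of-sequence id:

```python
# Minimal greedy decoding loop (sketch; reuses the objects created above)
EOS_ID = 2  # placeholder: check tokenizer.sp_model for the actual EOS id
max_new_tokens = 50

generated = input_ids_tensor
with torch.no_grad():
    for _ in range(max_new_tokens):
        outputs = model(input_ids=generated, lang_ids=lang_id, char_ids=None)
        # Pick the highest-probability token at the last position
        next_token = torch.argmax(outputs["logits"][:, -1, :], dim=-1, keepdim=True)
        generated = torch.cat([generated, next_token], dim=-1)
        if next_token.item() == EOS_ID:
            break

# Decode the full sequence back to text
print(tokenizer.sp_model.DecodeIds(generated[0].tolist()))
```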
See `generate_multilingual.py` for a complete text generation implementation, including sampling parameters.
## Limitations
This is an early version of the model with the following limitations:
- Limited contextual knowledge
- May generate inaccurate or nonsensical information
- Performance varies depending on input prompt and generation parameters
## Acknowledgments
This work builds upon advancements in language model architecture and training techniques from the research community.