DheeGPT-Qwen3-Malayalam
Part of the DheeGPT-Qwen3 collection: 2B-parameter multilingual models by DheeYantra for natural conversation in 8 Indian languages.
DheeGPT-Qwen3-Malayalam is a large language model designed for high-quality natural language understanding and generation in Malayalam. It is based on the Qwen3 architecture and optimized for both dialogue and reasoning tasks.
The model supports fluent conversational responses and reasoning-style outputs, making it suitable for applications such as chatbots, virtual assistants, and step-by-step question answering. It can be used directly with the Hugging Face transformers library:
from transformers import AutoTokenizer, AutoModelForCausalLM
model_name = "dheeyantra/dheegpt-qwen3-malayalam"
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(model_name, trust_remote_code=True)
# Regular conversation ("Hello! How is the weather today?" in Malayalam)
prompt = "നമസ്കാരം! ഇന്നത്തെ കാലാവസ്ഥ എങ്ങനെയാണ്?"
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=100)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
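For multi-turn or chat-style use, the prompt can also be built with the tokenizer's chat template. The sketch below assumes this fine-tune ships the standard Qwen3 chat template with its tokenizer (not confirmed here); it reuses the model and tokenizer loaded above, with the same Malayalam greeting as the user turn.
messages = [
    {"role": "user", "content": "നമസ്കാരം! ഇന്നത്തെ കാലാവസ്ഥ എങ്ങനെയാണ്?"}
]
# Build the prompt from the chat template and decode only the newly generated reply
chat_prompt = tokenizer.apply_chat_template(messages, tokenize=False, add_generation_prompt=True)
inputs = tokenizer(chat_prompt, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=200)
reply = tokenizer.decode(outputs[0][inputs["input_ids"].shape[-1]:], skip_special_tokens=True)
print(reply)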
To serve this model using vLLM, ensure the following:
GPU with compute capability ≥ 8.0 (e.g., NVIDIA A100).
PyTorch 2.1+ with CUDA toolkit installed.
For Tesla V100 GPUs (compute capability 7.0), vLLM GPU inference is not supported; a CPU-only fallback is possible but slow.
Python dependencies:
pip install torch transformers vllm sentencepiece
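Since torch is installed by the command above, the GPU requirement can be verified before launching the server. This is an optional sanity check, not a required step:
import torch

# Check the compute capability >= 8.0 requirement noted above
if torch.cuda.is_available():
    major, minor = torch.cuda.get_device_capability(0)
    print(f"{torch.cuda.get_device_name(0)}: compute capability {major}.{minor}")
    if (major, minor) < (8, 0):
        print("Below 8.0; GPU serving with vLLM may not work (see the V100 note above).")
else:
    print("No CUDA GPU detected; only the slow CPU fallback is possible.")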
Example vLLM command:
vllm serve dheeyantra/dheegpt-qwen3-malayalam \
--host 0.0.0.0 \
--port 8000
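The server exposes an OpenAI-compatible API on the configured port. A minimal client sketch using the openai Python package (installed separately); the model field must match the served model id:
from openai import OpenAI

# Point the OpenAI client at the local vLLM server started above
client = OpenAI(base_url="http://localhost:8000/v1", api_key="EMPTY")

response = client.chat.completions.create(
    model="dheeyantra/dheegpt-qwen3-malayalam",
    messages=[
        {"role": "user", "content": "നമസ്കാരം! ഇന്നത്തെ കാലാവസ്ഥ എങ്ങനെയാണ്?"}  # "Hello! How is the weather today?"
    ],
    max_tokens=200,
)
print(response.choices[0].message.content)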
Released under the Apache 2.0 License.