Lughaat-1.0-8B-Instruct

Overview

Lughaat-1.0-8B-Instruct is an Urdu language model developed by Muhammad Noman and built on the Llama 3.1 8B architecture. It is trained on muhammadnoman76/lughaat-urdu-dataset-llm, the largest Urdu dataset compiled by Muhammad Noman, which the developer reports enables it to outperform comparable models such as Qwen2.5-7B, Mistral 7B, and Alif 8B on Urdu language tasks.

Model Details

Model Name: Lughaat-1.0-8B-Instruct
Architecture: Based on Llama 3.1 8B
Developer: Muhammad Noman
Language: Urdu
Training Dataset: muhammadnoman76/lughaat-urdu-dataset-llm
Contact:
- Email: muhammadnomanshafiq76@gmail.com
- LinkedIn: https://www.linkedin.com/in/muhammad-noman76/

Installation & Usage

This model is available on Hugging Face and can be loaded in any of the following three ways:

Method 1: Using Unsloth for Optimized Inference

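from unsloth import FastLanguageModel

# Loading settings. These values are assumptions (commonly used with Unsloth); adjust them for your hardware.
max_seq_length = 2048  # maximum context length in tokens
dtype = None           # None lets Unsloth pick float16 or bfloat16 automatically
load_in_4bit = True    # 4-bit quantization to reduce VRAM usage

model, tokenizer = FastLanguageModel.from_pretrained(
    model_name = "muhammadnoman76/Lughaat-1.0-8B-Instruct",
    max_seq_length = max_seq_length,
    dtype = dtype,
    load_in_4bit = load_in_4bit,
)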

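# Switch the model to Unsloth's inference mode
FastLanguageModel.for_inference(model)

# Alpaca-style prompt template for Urdu instructions.
# The three {} slots are filled with the instruction, the optional input, and the response.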
lughaat_prompt = """نیچے ایک ہدایت ہے جو کسی کام کی تفصیل بیان کرتی ہے، جس کے ساتھ ایک ان پٹ دیا گیا ہے جو مزید سندات فراہم کرتا ہے۔ تھوڑا وقت لیکر ایک جواب لکھیں جو درست طریقے سے درخواست مکمل کریں
### Instruction:
{}
### Input:
{}
### Response:
{}"""

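# Example usage: fill in the instruction and leave the input and response empty
inputs = tokenizer(
[
    lughaat_prompt.format(
        "قائد اعظم کون ہے؟",  # instruction: "Who is Quaid-e-Azam?"
        "",  # input: optional extra context
        "",  # response: left empty so the model generates it
    )
], return_tensors = "pt").to("cuda")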

# Generate response with streaming
from transformers import TextStreamer
text_streamer = TextStreamer(tokenizer)
outputs = model.generate(**inputs, streamer = text_streamer, max_new_tokens = 128)

Method 2: Using Hugging Face Pipeline

from transformers import pipeline

pipe = pipeline("text-generation", model="muhammadnoman76/Lughaat-1.0-8B-Instruct")
# Same layout as the prompt template: each section header sits on its own line
result = pipe("نیچے ایک ہدایت ہے جو کسی کام کی تفصیل بیان کرتی ہے، جس کے ساتھ ایک ان پٹ دیا گیا ہے جو مزید سندات فراہم کرتا ہے۔ تھوڑا وقت لیکر ایک جواب لکھیں جو درست طریقے سے درخواست مکمل کریں\n### Instruction:\nقائد اعظم کون ہے؟\n### Input:\n\n### Response:\n", max_new_tokens=128)
print(result[0]["generated_text"])

Method 3: Direct Loading with Transformers

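import torch
from transformers import AutoTokenizer, AutoModelForCausalLM

tokenizer = AutoTokenizer.from_pretrained("muhammadnoman76/Lughaat-1.0-8B-Instruct")
# Half precision is an assumption to fit typical GPUs; the model must sit on the same device as the inputs
model = AutoModelForCausalLM.from_pretrained("muhammadnoman76/Lughaat-1.0-8B-Instruct", torch_dtype=torch.float16).to("cuda")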

# Process input
prompt = """نیچے ایک ہدایت ہے جو کسی کام کی تفصیل بیان کرتی ہے، جس کے ساتھ ایک ان پٹ دیا گیا ہے جو مزید سندات فراہم کرتا ہے۔ تھوڑا وقت لیکر ایک جواب لکھیں جو درست طریقے سے درخواست مکمل کریں
### Instruction:
قائد اعظم کون ہے؟
### Input:

### Response:
"""

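# Tokenize the prompt and move the tensors to the GPU
inputs = tokenizer(prompt, return_tensors="pt").to("cuda")
outputs = model.generate(**inputs, max_new_tokens=128)
# The decoded text contains the prompt followed by the generated response
response = tokenizer.decode(outputs[0], skip_special_tokens=True)
print(response)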

Prompt Format

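For optimal results, use the following prompt format. The Urdu header translates roughly as "Below is an instruction that describes a task, paired with an input that provides further context. Take a little time to write a response that correctly completes the request." Keep it in Urdu when prompting the model: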

نیچے ایک ہدایت ہے جو کسی کام کی تفصیل بیان کرتی ہے، جس کے ساتھ ایک ان پٹ دیا گیا ہے جو مزید سندات فراہم کرتا ہے۔ تھوڑا وقت لیکر ایک جواب لکھیں جو درست طریقے سے درخواست مکمل کریں
### Instruction:
[Your instruction in Urdu]
### Input:
[Additional context or input - can be empty]
### Response:
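
The template can also be filled in programmatically. The following is a minimal sketch and not part of the original model card: `build_prompt` and `extract_response` are hypothetical helper names, and the sketch assumes the decoded output consists of the prompt followed by the model's answer, as happens when the full generated sequence is decoded.

```python
# Hypothetical helpers (not from the model card) for working with the prompt format.

# Urdu header copied verbatim from the prompt format.
HEADER = "نیچے ایک ہدایت ہے جو کسی کام کی تفصیل بیان کرتی ہے، جس کے ساتھ ایک ان پٹ دیا گیا ہے جو مزید سندات فراہم کرتا ہے۔ تھوڑا وقت لیکر ایک جواب لکھیں جو درست طریقے سے درخواست مکمل کریں"

RESPONSE_MARKER = "### Response:"


def build_prompt(instruction: str, input_text: str = "") -> str:
    """Fill the template; the response section is left empty for the model."""
    return (
        f"{HEADER}\n"
        f"### Instruction:\n{instruction}\n"
        f"### Input:\n{input_text}\n"
        f"{RESPONSE_MARKER}\n"
    )


def extract_response(generated_text: str) -> str:
    """Return the text after the first response marker, or the whole text if no marker is found."""
    _, marker, tail = generated_text.partition(RESPONSE_MARKER)
    return tail.strip() if marker else generated_text.strip()


prompt = build_prompt("قائد اعظم کون ہے؟")
print(prompt)
```

The string returned by `build_prompt` can be passed to the tokenizer or pipeline in any of the three methods, and `extract_response` can be applied to the decoded output to drop the echoed prompt.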

Model Capabilities

Lughaat-1.0-8B-Instruct is specifically designed for Urdu language processing tasks including:

- Question answering
- Text generation
- Summarization
- Translation
- Content creation
- Conversational AI in Urdu

Hardware Requirements

- A CUDA-compatible GPU is recommended for optimal performance
- Minimum of 16 GB VRAM for full-precision inference
- 8 GB VRAM when using 4-bit quantization
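
As a rough cross-check of these figures, weight memory can be estimated as parameter count times bytes per parameter. The sketch below assumes a nominal 8 billion parameters and counts weights only; activations, the KV cache, and framework overhead come on top, so real usage will be higher. The function name `weight_memory_gb` is illustrative.

```python
def weight_memory_gb(num_params: float, bits_per_param: int) -> float:
    """Approximate memory for model weights alone, in gigabytes (10^9 bytes)."""
    return num_params * bits_per_param / 8 / 1e9


NUM_PARAMS = 8e9  # nominal size of an 8B model (assumption)

# Compare common weight precisions.
for label, bits in [("32-bit", 32), ("16-bit", 16), ("4-bit", 4)]:
    print(f"{label}: ~{weight_memory_gb(NUM_PARAMS, bits):.0f} GB for weights")
```

Under these assumptions, 16-bit weights alone come to roughly 16 GB and 4-bit weights to roughly 4 GB. This suggests the 16 GB figure corresponds to 16-bit weights with little headroom, while the 8 GB figure leaves room for overhead in the 4-bit case.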

Performance Benchmarks