🎓 CengBot - Çukurova University Computer Engineering Assistant
Author: naholav
Date: August 2025
A bilingual (Turkish/English) AI assistant specifically trained for Çukurova University Computer Engineering Department students using LLaMA 3.2 3B with LoRA fine-tuning.
📋 Model Details
Developed by: naholav
Model type: Causal Language Model (LLaMA 3.2 3B + LoRA)
Language(s): Turkish, English
License: Apache-2.0
Finetuned from: meta-llama/Llama-3.2-3B
Library: PEFT 0.15.2
Dataset: Naholav/cukurova_university_chatbot
🔗 Project Links
- GitHub Repository: https://github.com/naholav/cengbot
- Model on HuggingFace: https://huggingface.co/Naholav/cengbot-lora-tr-en-cukurova
- Dataset on HuggingFace: https://huggingface.co/datasets/Naholav/cukurova_university_chatbot
🤖 Advanced Inference Script
We provide a carefully optimized standalone inference script (cengbot_inference.py) that incorporates extensive prompt engineering and parameter tuning developed through iterative testing.
🔧 Script Features
Dual Prompt Engineering:
- Turkish Prompt: Specifically crafted system prompt with strict response guidelines (max 4 sentences)
- English Prompt: Parallel optimized prompt maintaining consistency across languages
- Language Detection: Multi-layer detection using Turkish characters, common words, and langdetect library fallback
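As a rough illustration, language detection of this kind can be built as a small heuristic over Turkish-specific characters and frequent Turkish words, with langdetect as the final fallback. The function below is a minimal sketch of that idea, not the exact logic in cengbot_inference.py; the word list and ordering of checks are illustrative assumptions.

import langdetect

TURKISH_CHARS = set("çğıöşü")
# Illustrative list of common Turkish words; the actual script may use a different set.
TURKISH_WORDS = {"ve", "bir", "bu", "için", "ile", "nasıl", "nedir", "merhaba", "ders"}

def detect_language(text: str) -> str:
    """Return "tr" or "en" using character/word heuristics with a langdetect fallback."""
    lowered = text.lower()
    # Layer 1: Turkish-specific characters are a strong signal.
    if any(ch in TURKISH_CHARS for ch in lowered):
        return "tr"
    # Layer 2: look for common Turkish words.
    if any(word in TURKISH_WORDS for word in lowered.split()):
        return "tr"
    # Layer 3: fall back to langdetect for everything else.
    try:
        return "tr" if langdetect.detect(text) == "tr" else "en"
    except Exception:
        return "en"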
Optimized Generation Parameters:
# Carefully tuned through extensive testing
temperature = 0.7 # Balanced creativity vs consistency
max_new_tokens = 200 # Sufficient for detailed answers
repetition_penalty = 1.1 # Prevents repetitive responses
top_p = 0.95 # Nucleus sampling for quality
top_k = 50 # Top-k filtering for coherence
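For reference, these values map directly onto Hugging Face generate() keyword arguments. A minimal sketch, assuming model, tokenizer, and inputs are prepared as in the Direct Use example further below:

# Minimal sketch: the tuned parameters passed to Hugging Face generation.
outputs = model.generate(
    **inputs,
    do_sample=True,            # sampling must be enabled for temperature/top_p/top_k to take effect
    temperature=0.7,
    max_new_tokens=200,
    repetition_penalty=1.1,
    top_p=0.95,
    top_k=50,
)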
Smart Response Control:
- Greeting Detection: Automatically reduces token usage for simple greetings (30 tokens vs 100)
- Response Limiting: Enforces maximum 4 sentences with intelligent sentence splitting
- Question Limiting: Restricts model to ask maximum 1 question per response
- Language-Specific Processing: Different response extraction logic for Turkish vs English
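These controls can be approximated with lightweight post-processing. The snippet below is a rough sketch of the idea (greeting keyword check, sentence cap, question cap); the exact keyword lists and splitting rules in cengbot_inference.py may differ.

import re

# Illustrative greeting keywords; the real script may use a larger list.
GREETINGS = {"merhaba", "selam", "hello", "hi", "hey"}

def pick_token_budget(question: str) -> int:
    """Use a small generation budget for bare greetings, a larger one for real questions."""
    return 30 if question.lower().strip(" !.?") in GREETINGS else 100

def limit_response(text: str, max_sentences: int = 4, max_questions: int = 1) -> str:
    """Keep at most four sentences and at most one question in the reply."""
    sentences = re.split(r"(?<=[.!?])\s+", text.strip())
    kept, questions = [], 0
    for sentence in sentences[:max_sentences]:
        if sentence.endswith("?"):
            if questions >= max_questions:
                continue  # drop any question beyond the first
            questions += 1
        kept.append(sentence)
    return " ".join(kept)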
Prompt Engineering Process: The system prompts were developed through extensive trial-and-error testing to:
- Maintain university-specific context
- Ensure concise but helpful responses
- Balance friendliness with professionalism
- Optimize for educational support scenarios
- Handle both casual questions and technical inquiries
System Prompt Examples:
Turkish System Prompt (Optimized):
Sen Çukurova Üniversitesi Bilgisayar Mühendisliği bölümünün deneyimli dijital asistanısın.
Öğrencilere samimi, yardımsever ve doğru bilgiler vererek destek oluyorsun.
Önemli kurallar:
- Her soruya MAKSİMUM 4 CÜMLE ile yanıt ver
- Cevapların çok kısa, net ve anlaşılır olmalı
- Gereksiz açıklamalardan ve tekrarlardan kesinlikle kaçın
- Selamlama mesajlarına tek cümlelik karşılık ver
- Sadece sorulan soruya odaklan, ekstra bilgi verme
- Cevaplarında MAKSİMUM 1 adet soru sorabilirsin
English System Prompt (Parallel):
You are an experienced digital assistant for Çukurova University Computer Engineering Department.
You help students by providing friendly, helpful and accurate information.
Important rules:
- Answer each question with MAXIMUM 4 SENTENCES
- Keep your answers very short, clear and understandable
- Absolutely avoid unnecessary explanations and repetitions
- Reply to greeting messages with a single sentence
- Focus only on the asked question, don't provide extra information
- You can ask MAXIMUM 1 question in your answers
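At inference time the language-specific system prompt is combined with the user's question into a single prompt string. The helper below only illustrates the general pattern; the actual template lives in cengbot_inference.py, and the "Soru:/Cevap:" and "Question:/Answer:" markers here are assumptions rather than the confirmed format.

def build_prompt(system_prompt: str, question: str, language: str) -> str:
    """Hedged sketch: join the system prompt and user question; field labels are illustrative."""
    if language == "tr":
        return f"{system_prompt}\n\nSoru: {question}\nCevap:"
    return f"{system_prompt}\n\nQuestion: {question}\nAnswer:"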
🚀 How to Get Started
Direct Use:
import torch
from transformers import AutoTokenizer, AutoModelForCausalLM
from peft import PeftModel
# Load base model and tokenizer
device = "cuda" if torch.cuda.is_available() else "cpu"
tokenizer = AutoTokenizer.from_pretrained("meta-llama/Llama-3.2-3B")
base_model = AutoModelForCausalLM.from_pretrained("meta-llama/Llama-3.2-3B", torch_dtype=torch.bfloat16).to(device)
# Load LoRA adapter
model = PeftModel.from_pretrained(base_model, "Naholav/cengbot-lora-tr-en-cukurova")
model.eval()
# Generate response
inputs = tokenizer("Merhaba, bilgisayar mühendisliği hakkında bilgi verebilir misin?", return_tensors="pt").to(device)
outputs = model.generate(**inputs, max_new_tokens=100)
response = tokenizer.decode(outputs[0], skip_special_tokens=True)
print(response)
Simple Setup:
- Install dependencies:
pip install torch transformers peft langdetect
- Download and run:
wget https://raw.githubusercontent.com/naholav/cengbot/main/cengbot_inference.py
python cengbot_inference.py
That's it! The model will be downloaded automatically from HuggingFace.
🎯 Features
- Dual Language Support: Automatically detects Turkish or English input using langdetect
- Dual Prompt System: Separate optimized prompts for Turkish and English responses
- Dual Language Validation: Separate TR/EN validation during training (TR Loss: 0.6750, EN Loss: 0.7387)
- University-Specific: Trained on Çukurova University Computer Engineering data
- Concise Responses: Maximum 4 sentences per response with smart length control
- Smart Greeting Detection: Optimized token usage for greetings vs detailed questions
- Language Detection: Advanced Turkish character and word detection fallbacks
- Easy Setup: Just copy and run the inference script
📋 Architecture & Training Details
Architecture:
- Base Model: meta-llama/Llama-3.2-3B (3.2B parameters, 28 layers, 3072 hidden size)
- Model Type: LLaMA (torch.bfloat16)
- Fine-tuning Method: LoRA (Low-Rank Adaptation)
- LoRA Config: r=16, alpha=32, dropout=0.1
- Target Modules: q_proj, k_proj, v_proj, o_proj, gate_proj, up_proj, down_proj
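These hyperparameters map directly onto a PEFT LoraConfig. A minimal sketch of an equivalent configuration is shown below; the task_type and any unstated defaults are assumptions, since only r, alpha, dropout, and the target modules are listed above.

from peft import LoraConfig

# Sketch of a LoraConfig matching the values listed above; other defaults are assumed.
lora_config = LoraConfig(
    r=16,
    lora_alpha=32,
    lora_dropout=0.1,
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj", "gate_proj", "up_proj", "down_proj"],
    task_type="CAUSAL_LM",  # assumed; the standard choice for causal language models
)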
Training Specifications:
- Languages: Turkish, English (near-balanced bilingual training)
- Domain: Computer Engineering Education (Çukurova University specific)
- Response Style: Concise, helpful, friendly (maximum 4 sentences)
- Prompt Engineering: Dual system prompts with language-specific optimization
- Language Detection: Advanced Turkish character + word detection with langdetect fallback
- Generation Parameters: temp=0.7, top_p=0.95, top_k=50, rep_penalty=1.1
Hardware & Performance:
- Training Hardware: RTX 5090 32GB (90% utilization), 128GB RAM, AMD Ryzen 9 9950X
- Best Validation: 0.6920 overall, 0.6636 TR, 0.7268 EN
- Inference Speed: Optimized for real-time chat with greeting detection
💻 System Requirements
For Inference:
- GPU: CUDA-compatible (8GB+ VRAM recommended)
- RAM: 8GB minimum, 16GB recommended
- Python: 3.8+
- Internet: For initial model download
📖 Usage Examples
Turkish:
👤 You: Merhaba, bilgisayar mühendisliği hakkında bilgi verebilir misin? ("Hello, can you give me some information about computer engineering?")
🤖 CengBot: Merhaba! Bilgisayar mühendisliği, yazılım geliştirme, sistem tasarımı ve teknoloji çözümleri üzerine odaklanan bir disiplindir. Çukurova Üniversitesi'nde güçlü bir müfredatımız var. Hangi konuda detay istiyorsun? ("Hello! Computer engineering is a discipline focused on software development, system design, and technology solutions. We have a strong curriculum at Çukurova University. Which topic would you like more detail on?")
English:
👤 You: What programming languages should I learn first?
🤖 CengBot: I recommend starting with Python for its simplicity and versatility, then moving to Java or C++ for fundamental concepts. These languages provide a solid foundation for computer engineering. Which area interests you most - web development, systems programming, or data science?
🔧 Training Performance
Dataset Information:
- Total Examples: 22,524 (11,187 Turkish, 11,337 English)
- Training Set: 20,948 examples
- Validation Sets: 783 Turkish, 793 English (7% split each)
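The per-language 7% hold-out could be reproduced roughly as sketched below; the field name, seed, and splitting logic are illustrative assumptions rather than the repository's actual preprocessing code.

import random

def split_by_language(examples, val_fraction=0.07, seed=42):
    """Hold out ~7% of each language separately so TR and EN can be validated independently."""
    rng = random.Random(seed)
    train, val_tr, val_en = [], [], []
    for lang, val_bucket in (("tr", val_tr), ("en", val_en)):
        subset = [ex for ex in examples if ex["language"] == lang]  # "language" field is assumed
        rng.shuffle(subset)
        n_val = int(len(subset) * val_fraction)
        val_bucket.extend(subset[:n_val])
        train.extend(subset[n_val:])
    return train, val_tr, val_en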
Training Progress:
- Total Steps: 1170/1312 (Early stopping activated at step 1170)
- Best Eval Loss: 0.6920 (Step 975) → Model saved from best checkpoint
- Final Train Loss: 0.5104 (training loss was still decreasing when early stopping triggered)
- Early Stopping: Patience=3, triggered due to eval loss increase
- Training Time: ~3 hours on RTX 5090
Best Model Performance (Step 975):
- Overall Eval Loss: 0.6920
- Turkish Validation Loss: 0.6636
- English Validation Loss: 0.7268
- Average Loss: 0.6952
Training Features:
- Dual Language Evaluation: Separate TR/EN validation tracking every step
- Early Stopping: Patience=3 prevents overfitting, saves best model
- Advanced Optimization: Mixed precision (BF16), gradient checkpointing
- LoRA Configuration: r=16, alpha=32, dropout=0.1, targets all linear layers
- Hardware Optimization: RTX 5090 32GB (90% VRAM usage), 128GB RAM, 16-core CPU
- Memory Management: Pre-cached datasets in RAM, persistent workers
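For orientation, the early-stopping and mixed-precision setup described above could be wired up roughly as follows with the Hugging Face Trainer; apart from patience=3, BF16, and gradient checkpointing, the values and variable names are assumptions rather than the actual training configuration.

from transformers import TrainingArguments, Trainer, EarlyStoppingCallback

def build_trainer(model, train_dataset, eval_dataset):
    """Sketch of the early-stopping / BF16 training setup; most values here are assumed."""
    training_args = TrainingArguments(
        output_dir="cengbot-lora",        # assumed output path
        bf16=True,                        # mixed precision, as stated above
        gradient_checkpointing=True,
        eval_strategy="steps",            # periodic evaluation so early stopping can trigger
        save_strategy="steps",
        load_best_model_at_end=True,      # keep the best checkpoint (step 975 in this run)
        metric_for_best_model="eval_loss",
    )
    return Trainer(
        model=model,
        args=training_args,
        train_dataset=train_dataset,
        eval_dataset=eval_dataset,
        callbacks=[EarlyStoppingCallback(early_stopping_patience=3)],
    )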
Prompt Engineering & Data Processing:
- Dual Template System: Separate training templates for Turkish and English
- Response Masking: Only LoRA adapter learns from assistant responses, not user questions
- Template Consistency: Identical structure between training and inference prompts
- Balanced Dataset: Near-balanced Turkish (11,187) and English (11,337) examples
- Response Quality Control: Training data filtered for concise, educational responses
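Response masking of the kind described above is usually implemented by setting the label of every prompt token to -100, so that the loss (and therefore the LoRA update) only comes from assistant-response tokens. The function below is a generic sketch of that technique with assumed field names, not the repository's exact preprocessing code.

def mask_prompt_tokens(tokenizer, prompt: str, response: str, max_length: int = 512):
    """Tokenize prompt + response and mask the prompt so only the response contributes to the loss."""
    prompt_ids = tokenizer(prompt, add_special_tokens=False)["input_ids"]
    response_ids = tokenizer(response, add_special_tokens=False)["input_ids"]
    input_ids = (prompt_ids + response_ids + [tokenizer.eos_token_id])[:max_length]
    # -100 is the ignore index of the cross-entropy loss, so prompt tokens receive no gradient.
    labels = ([-100] * len(prompt_ids) + response_ids + [tokenizer.eos_token_id])[:max_length]
    return {"input_ids": input_ids, "labels": labels, "attention_mask": [1] * len(input_ids)}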
⚠️ Limitations and Bias
- Domain Specific: Optimized for computer engineering education
- Response Length: Limited to maximum 4 sentences for conciseness
- University Context: Trained specifically on Çukurova University data
- Language Balance: Balanced TR/EN training but may show slight performance differences
🔧 Technical Specifications
Hardware: RTX 5090 32GB, 128GB RAM, AMD Ryzen 9 9950X
Training Time: ~3 hours
Framework: PEFT 0.15.2, Transformers, PyTorch
Optimization: Mixed precision (BF16), gradient checkpointing
📈 Environmental Impact
Hardware Type: NVIDIA RTX 5090
Hours Used: ~3 hours
Compute Region: Local training setup
Carbon Emissions: Not formally measured; expected to be low given the short (~3 hour) single-GPU LoRA run
🤝 Contributing
- Fork the repository
- Create your feature branch
- Commit your changes
- Push to the branch
- Create a Pull Request
📞 Contact
- Issues: https://github.com/naholav/cengbot/issues
- Author: naholav
📚 Citation
@misc{cengbot2025,
title={CengBot: A Bilingual AI Assistant for Computer Engineering Education},
author={naholav},
year={2025},
publisher={HuggingFace},
url={https://huggingface.co/Naholav/cengbot-lora-tr-en-cukurova}
}
Happy learning! 🎓