---
base_model: deepseek-ai/DeepSeek-R1-Distill-Qwen-1.5B
tags:
- instruction-following
- conversational-ai
- lora
- alpaca
- 4bit
- instruct
license: apache-2.0
datasets:
- tatsu-lab/alpaca
language:
- en
---

# DeepSeek-R1-Distill-Qwen-1.5B-Alpaca-Instruct

DeepSeek-R1-Distill-Qwen-1.5B fine-tuned for instruction-following tasks with LoRA on the Alpaca dataset.

## Overview

- **Base Model:** deepseek-ai/DeepSeek-R1-Distill-Qwen-1.5B (1.5B parameters)
- **Fine-tuning Method:** LoRA with 4-bit quantization
- **Dataset:** Alpaca instruction dataset (52K samples)
- **Training:** 3 epochs with optimized hyperparameters

## Key Features

- Improved instruction-following capabilities
- Conversational question answering
- Memory-efficient training with LoRA
- Production-ready merged model

## Usage

```python
from transformers import AutoTokenizer, AutoModelForCausalLM

model = AutoModelForCausalLM.from_pretrained("sweatSmile/DeepSeek-R1-Distill-Qwen-1.5B-Alpaca-Instruct")
tokenizer = AutoTokenizer.from_pretrained("sweatSmile/DeepSeek-R1-Distill-Qwen-1.5B-Alpaca-Instruct")

# Example
prompt = "Human: What is machine learning?\n\nAssistant:"
inputs = tokenizer(prompt, return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=200)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```

## Training Details

- LoRA rank: 8, alpha: 16
- 4-bit NF4 quantization with bfloat16 compute
- Learning rate: 1e-4 with cosine scheduling
- Batch size: 8; maximum sequence length: 512 tokens

Trained for efficient deployment in production environments.
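
For reference, the quantization and LoRA settings listed above can be expressed roughly as the sketch below, assuming `transformers`, `peft`, and `bitsandbytes`. The hyperparameters (rank 8, alpha 16, NF4, bfloat16) come from this card; the `target_modules` list and dropout value are illustrative assumptions, not the exact training script.

```python
# Minimal sketch of the fine-tuning configuration described on this card.
# target_modules and lora_dropout are assumptions; adjust to the actual setup.
import torch
from transformers import AutoModelForCausalLM, BitsAndBytesConfig
from peft import LoraConfig, get_peft_model

bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",              # 4-bit NF4 quantization
    bnb_4bit_compute_dtype=torch.bfloat16,  # bfloat16 compute dtype
)

base_model = AutoModelForCausalLM.from_pretrained(
    "deepseek-ai/DeepSeek-R1-Distill-Qwen-1.5B",
    quantization_config=bnb_config,
    device_map="auto",
)

lora_config = LoraConfig(
    r=8,                    # LoRA rank (from this card)
    lora_alpha=16,          # LoRA alpha (from this card)
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj"],  # assumed attention projections
    lora_dropout=0.05,      # assumed; not stated on this card
    task_type="CAUSAL_LM",
)

model = get_peft_model(base_model, lora_config)
model.print_trainable_parameters()
```

The adapter can then be trained with a standard `Trainer`/`SFTTrainer` loop at learning rate 1e-4 with a cosine schedule, batch size 8, and sequences truncated to 512 tokens, then merged into the base weights for deployment.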