Model Card for Phi-4 LoRA Fine-tuned Model

This model is a LoRA fine-tuned version of Microsoft's Phi-4-mini-instruct, adapted to improve code review responses using data from GitHub.

Model Details

Model Description

This is a fine-tuned version of Microsoft's Phi-4-mini-instruct model using the LoRA (Low-Rank Adaptation) technique. The model was trained on approximately 10,000 instruction-response pairs to enhance its ability to follow instructions and generate high-quality responses across various tasks.

The model uses 4-bit quantization with NF4 for efficient inference while maintaining performance quality. It's designed to be a lightweight yet capable language model suitable for various text generation tasks.

  • Developed by: Milos Kotlar
  • Model type: Causal Language Model
  • Language(s) (NLP): English
  • License: MIT
  • Finetuned from model: microsoft/Phi-4-mini-instruct

Model Sources

  • Repository: https://github.com/kotlarmilos/phi4-finetuned

Uses

Direct Use

The model is designed for:

  • Instruction Following: Generate responses to user instructions and queries
  • Conversational AI: Engage in multi-turn conversations
  • Task Completion: Help with various text-based tasks like summarization, explanation, and creative writing
  • Educational Support: Provide explanations and assistance for learning

Downstream Use

The model can be integrated into:

  • Chatbot Applications: Web applications, mobile apps, and customer service systems
  • Content Generation Tools: Writing assistants and creative content platforms
  • Educational Platforms: Tutoring systems and interactive learning environments
  • API Services: Text generation services and intelligent automation workflows

Out-of-Scope Use

The model is not intended for:

  • Factual Information Retrieval: May generate plausible but incorrect information
  • Professional Medical/Legal Advice: Not qualified for specialized professional guidance
  • Real-time Critical Systems: Not suitable for safety-critical applications
  • Harmful Content Generation: Should not be used to create misleading, harmful, or malicious content

How to Get Started with the Model

import torch
from transformers import AutoTokenizer, AutoModelForCausalLM, BitsAndBytesConfig
from peft import PeftModel

# Load base model with quantization
base_model = "microsoft/Phi-4-mini-instruct"
lora_path = "artifacts/phi4-finetuned"  # local path to the LoRA adapter weights

bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_use_double_quant=True,
    bnb_4bit_compute_dtype=torch.bfloat16,
)

tokenizer = AutoTokenizer.from_pretrained(base_model, use_fast=True)

base = AutoModelForCausalLM.from_pretrained(
    base_model,
    quantization_config=bnb_config,
    device_map="auto",
    trust_remote_code=True
)

# Load LoRA adapter
model = PeftModel.from_pretrained(base, lora_path)

# Generate text
def generate(prompt):
    inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
    output = model.generate(
        **inputs, 
        max_new_tokens=256, 
        do_sample=True, 
        temperature=0.7,
        pad_token_id=tokenizer.eos_token_id
    )
    return tokenizer.decode(output[0], skip_special_tokens=True)

# Example usage
prompt = "Review the following code changes:"
response = generate(prompt)
print(response)
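
Since Phi-4-mini-instruct is an instruction-tuned chat model, you may get better results by formatting prompts with the tokenizer's chat template rather than passing raw text. The following is a sketch that reuses the model and tokenizer loaded above:

# Chat-style prompting (reuses `model` and `tokenizer` from the snippet above)
def generate_chat(user_message):
    messages = [{"role": "user", "content": user_message}]
    input_ids = tokenizer.apply_chat_template(
        messages, add_generation_prompt=True, return_tensors="pt"
    ).to(model.device)
    output = model.generate(
        input_ids,
        max_new_tokens=256,
        do_sample=True,
        temperature=0.7,
        pad_token_id=tokenizer.eos_token_id,
    )
    # Decode only the newly generated tokens
    return tokenizer.decode(output[0][input_ids.shape[-1]:], skip_special_tokens=True)

print(generate_chat("Review the following code changes: ..."))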

Training Details

Training Data

The model was fine-tuned on approximately 10,000 high-quality instruction-response pairs designed to improve the model's ability to follow instructions and generate helpful, accurate responses across various domains.

Data Characteristics:

  • Size: ~10,000 instruction-response pairs
  • Format: Structured instruction-following conversations
  • Coverage: Diverse topics and instruction types
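
For illustration only, a single pair in such a dataset might look like the following. This is a hypothetical record, not taken from the actual training data:

# Hypothetical example record; field names and content are illustrative only
example = {
    "instruction": "Review the following code changes and point out potential issues:\n"
                   "+def parse(x): return int(x)",
    "response": "The new parse() helper will raise ValueError on non-numeric input; "
                "consider validating x or handling the exception at the call site.",
}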

Training Procedure

Preprocessing

  1. Data Preparation: Instruction-response pairs formatted for causal language modeling
  2. Tokenization: Text processed using Phi-4's tokenizer with appropriate special tokens
  3. Sequence Formatting: Proper formatting for instruction-following tasks
  4. Quality Filtering: Removal of low-quality or potentially harmful content
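
A minimal sketch of steps 1-3, assuming each pair is rendered with the tokenizer's chat template before tokenization (the exact formatting used in training is not documented here; the max_length value is an illustrative default):

from transformers import AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("microsoft/Phi-4-mini-instruct", use_fast=True)

def format_example(instruction, response):
    # Render one instruction-response pair as a single training sequence
    messages = [
        {"role": "user", "content": instruction},
        {"role": "assistant", "content": response},
    ]
    return tokenizer.apply_chat_template(messages, tokenize=False)

def tokenize_example(instruction, response, max_length=1024):
    # Tokenize for causal language modeling; labels mirror the input ids
    text = format_example(instruction, response)
    tokens = tokenizer(text, truncation=True, max_length=max_length)
    tokens["labels"] = tokens["input_ids"].copy()
    return tokens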

Training Hyperparameters

LoRA Configuration:

  • LoRA Rank (r): 8
  • LoRA Alpha: 16
  • LoRA Dropout: 0.05
  • Target Modules: ["qkv_proj", "gate_up_proj"]
  • Task Type: CAUSAL_LM
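
The configuration above corresponds to a peft LoraConfig roughly as follows (a sketch; the exact training script lives in the repository linked under Usage Examples):

from peft import LoraConfig

# LoRA hyperparameters as listed above
lora_config = LoraConfig(
    r=8,
    lora_alpha=16,
    lora_dropout=0.05,
    target_modules=["qkv_proj", "gate_up_proj"],
    task_type="CAUSAL_LM",
)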

Training Setup:

  • Base Model: microsoft/Phi-4-mini-instruct
  • Training Method: LoRA (Low-Rank Adaptation)
  • Quantization: 4-bit NF4 with BitsAndBytes
  • Training regime: Mixed-precision training
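
A minimal sketch of how the quantized base model and the LoRA adapter might be assembled for training, assuming the standard QLoRA-style flow from transformers and peft (the actual training script may differ):

import torch
from transformers import AutoModelForCausalLM, BitsAndBytesConfig
from peft import LoraConfig, get_peft_model, prepare_model_for_kbit_training

# 4-bit NF4 quantization, matching the inference config shown earlier
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_use_double_quant=True,
    bnb_4bit_compute_dtype=torch.bfloat16,
)

base = AutoModelForCausalLM.from_pretrained(
    "microsoft/Phi-4-mini-instruct",
    quantization_config=bnb_config,
    device_map="auto",
    trust_remote_code=True,
)

# Prepare the quantized model for training and attach the LoRA adapter
base = prepare_model_for_kbit_training(base)
lora_config = LoraConfig(  # same configuration as sketched above
    r=8, lora_alpha=16, lora_dropout=0.05,
    target_modules=["qkv_proj", "gate_up_proj"], task_type="CAUSAL_LM",
)
model = get_peft_model(base, lora_config)
model.print_trainable_parameters()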

Usage Examples

Additional usage examples and the training code are available at https://github.com/kotlarmilos/phi4-finetuned
