Phi-4-mini-Vietnamese-Instruct

Model Description

  • Base Model: microsoft/phi-4-mini-instruct
  • Finetuning Technique: Low-Rank Adaptation (LoRA)
  • Quantization: 4-bit NF4 using bitsandbytes
  • Purpose: To create a powerful yet lightweight model capable of understanding and generating high-quality Vietnamese text.

The LoRA weights were merged into the base model, and the resulting model was quantized to optimize for performance and reduce memory footprint, making it suitable for deployment on consumer-grade hardware.
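
For reference, the merge step can be reproduced with peft's merge_and_unload(). The sketch below makes assumptions: the adapter and output paths are placeholders, not the actual paths used for this release.

import torch
from peft import PeftModel
from transformers import AutoModelForCausalLM

# Load the base model in bfloat16 (placeholder paths; adjust to your setup)
base = AutoModelForCausalLM.from_pretrained(
    "microsoft/phi-4-mini-instruct",
    torch_dtype=torch.bfloat16,
)

# Attach the trained LoRA adapter and fold its weights into the base layers
merged = PeftModel.from_pretrained(base, "path/to/lora-adapter").merge_and_unload()

# Save the standalone merged model, ready for quantization and deployment
merged.save_pretrained("path/to/merged-model")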

How to Use

As the LoRA weights have been merged, you can use this model directly with the transformers library without needing the peft library for inference.

import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "dinhnhat241103/Phi-4-mini-instruct-Vi"

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    device_map="auto",
    torch_dtype=torch.bfloat16, # Use bfloat16 for faster inference
    trust_remote_code=True 
)

# Create a prompt using the chat template
# This is the recommended way for instruction-tuned models
messages = [
    # Vietnamese prompt: "Write a short paragraph explaining quantization in AI."
    {"role": "user", "content": "Hãy viết một đoạn văn ngắn giải thích về Lượng tử hóa trong AI."},
]

prompt = tokenizer.apply_chat_template(messages, tokenize=False, add_generation_prompt=True)
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)

# Generate text
outputs = model.generate(
    **inputs,
    max_new_tokens=512,
    eos_token_id=tokenizer.eos_token_id,
    do_sample=True,
    temperature=0.7,
    top_p=0.9,
)
# Decode only the newly generated tokens so the prompt is not echoed back.
# Note: skip_special_tokens=True strips markers like <|assistant|>, so
# splitting the full decoded string on them would fail.
generated = outputs[0][inputs["input_ids"].shape[-1]:]
response = tokenizer.decode(generated, skip_special_tokens=True)
print(response.strip())
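
For constrained GPUs, the same checkpoint can be loaded with the 4-bit NF4 scheme mentioned above. A minimal sketch, assuming a CUDA device and the bitsandbytes package is installed:

import torch
from transformers import AutoModelForCausalLM, BitsAndBytesConfig

bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,                      # quantize weights to 4 bits at load time
    bnb_4bit_quant_type="nf4",              # NF4, the scheme named in this card
    bnb_4bit_compute_dtype=torch.bfloat16,  # run computations in bfloat16
)

model = AutoModelForCausalLM.from_pretrained(
    model_id,
    quantization_config=bnb_config,
    device_map="auto",
    trust_remote_code=True,
)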

Finetuning Details

The model was fine-tuned on the 5CD-AI/Vietnamese-nampdn-ai-tiny-webtext-gg-translated dataset with the settings below (a configuration sketch follows the list).

  • LoRA Rank: 16
  • LoRA Alpha: 32
  • Training Epochs: 1
  • Number of Samples: 500,000
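
The rank and alpha above map onto a peft LoraConfig roughly as follows. This is a sketch: the dropout value and target modules are assumptions, as the card does not list them.

from peft import LoraConfig

lora_config = LoraConfig(
    r=16,                         # LoRA Rank, as listed above
    lora_alpha=32,                # LoRA Alpha, as listed above
    lora_dropout=0.05,            # assumption: not stated in this card
    target_modules="all-linear",  # assumption: actual target modules not stated
    task_type="CAUSAL_LM",
)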

Evaluation results

The model's performance was evaluated on the Vietnamese Multitask Language Understanding (VMLU) benchmark.

Model                   Social Science   STEM   Humanities   Others   Avg
Phi-4 mini Vietnamese   40.85            48     42.06        43.31    42.84
Model size: 2.28B params (Safetensors; tensor types: F32, F16, U8)