Key1111/qwen2.5-7b-vietnamese-enhanced

Qwen2.5-7B-Instruct fine-tuned for Vietnamese language tasks with LoRA (Low-Rank Adaptation), with the adapter merged back into the base weights.

Model Details

  • Base Model: Qwen/Qwen2.5-7B-Instruct
  • Fine-tuning Method: LoRA (merged with base model)
  • Language: Vietnamese, English
  • Training Data: Alpaca + ViQuAD datasets
  • Model Size: ~7B parameters
  • Context Length: 2048 tokens (the max sequence length used during fine-tuning)

Usage

Direct Usage

import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

# Load the merged model directly (bf16 halves memory vs. fp32; device_map="auto" uses the GPU if available)
model = AutoModelForCausalLM.from_pretrained(
    "Key1111/qwen2.5-7b-vietnamese-enhanced",
    torch_dtype=torch.bfloat16,
    device_map="auto",
    trust_remote_code=True,
)
tokenizer = AutoTokenizer.from_pretrained("Key1111/qwen2.5-7b-vietnamese-enhanced", trust_remote_code=True)

# Generate text
prompt = "Xin chào! Bạn có thể giúp tôi không?"  # "Hello! Can you help me?"
messages = [{"role": "user", "content": prompt}]

# Apply the chat template, then tokenize and move the inputs to the model's device
text = tokenizer.apply_chat_template(messages, tokenize=False, add_generation_prompt=True)
inputs = tokenizer(text, return_tensors="pt", truncation=True, max_length=512).to(model.device)

outputs = model.generate(
    **inputs,
    max_new_tokens=256,
    temperature=0.7,
    top_p=0.9,
    do_sample=True,
    pad_token_id=tokenizer.eos_token_id,
    eos_token_id=tokenizer.eos_token_id,
)

# Decode only the newly generated tokens, not the echoed prompt
response = tokenizer.decode(outputs[0][inputs["input_ids"].shape[1]:], skip_special_tokens=True)
print(response)
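
For interactive use, you can stream tokens to stdout as they are generated instead of waiting for the full completion. A minimal sketch with transformers' TextStreamer, continuing from the snippet above; the generation parameters are illustrative:

from transformers import TextStreamer

# Print decoded tokens as they are produced; skip_prompt hides the echoed input
streamer = TextStreamer(tokenizer, skip_prompt=True, skip_special_tokens=True)
model.generate(**inputs, max_new_tokens=256, do_sample=True, temperature=0.7, streamer=streamer)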

Using PEFT (if you have the LoRA adapter)

import torch
from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import PeftModel

# Load the base model
base_model = AutoModelForCausalLM.from_pretrained(
    "Qwen/Qwen2.5-7B-Instruct",
    torch_dtype=torch.bfloat16,
    device_map="auto",
    trust_remote_code=True
)
tokenizer = AutoTokenizer.from_pretrained(
    "Qwen/Qwen2.5-7B-Instruct",
    trust_remote_code=True
)

# Attach the LoRA adapter on top of the base model
model = PeftModel.from_pretrained(base_model, "Key1111/qwen2.5-7b-vietnamese-enhanced")

# Use the model
messages = [{"role": "user", "content": "Xin chào, bạn có thể giúp tôi không?"}]  # "Hello, can you help me?"
text = tokenizer.apply_chat_template(messages, tokenize=False, add_generation_prompt=True)
inputs = tokenizer(text, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=512)
response = tokenizer.decode(outputs[0][inputs["input_ids"].shape[1]:], skip_special_tokens=True)
print(response)
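
Once the adapter is loaded this way, you can also fold it back into the base weights yourself for adapter-free inference. A minimal sketch using PEFT's merge_and_unload; the save path is illustrative:

# Fold the LoRA weights into the base model and drop the PEFT wrapper
merged_model = model.merge_and_unload()

# Save a standalone merged checkpoint (path is an example)
merged_model.save_pretrained("qwen2.5-7b-vietnamese-merged")
tokenizer.save_pretrained("qwen2.5-7b-vietnamese-merged")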

In n8n

Use this model directly in Hugging Face Inference nodes:

  • Model: Key1111/qwen2.5-7b-vietnamese-enhanced
  • No additional configuration needed
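
Outside n8n, the same hosted model can be called programmatically. A minimal sketch with huggingface_hub's InferenceClient, assuming the model is actually being served (for example via a dedicated Inference Endpoint); the generation parameters are illustrative:

from huggingface_hub import InferenceClient

# Requires a running endpoint for the model and, typically, an HF access token
client = InferenceClient(model="Key1111/qwen2.5-7b-vietnamese-enhanced")

response = client.chat_completion(
    messages=[{"role": "user", "content": "Xin chào! Bạn có thể giúp tôi không?"}],
    max_tokens=256,
    temperature=0.7,
)
print(response.choices[0].message.content)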

Training Details

  • Base Model: Qwen/Qwen2.5-7B-Instruct
  • Fine-tuning Method: LoRA (Low-Rank Adaptation)
  • LoRA Rank: 32
  • LoRA Alpha: 16
  • Learning Rate: 2e-4
  • Batch Size: 2
  • Gradient Accumulation Steps: 8
  • Max Sequence Length: 2048
  • Training Data:
    • Alpaca format dataset (Vietnamese instructions)
    • ViQuAD dataset (Vietnamese question-answering)
  • Total Training Samples: 5000
  • Training Epochs: 2
  • Optimizer: AdamW
  • Scheduler: Linear warmup
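
For reference, these hyperparameters map onto a PEFT + transformers setup roughly as follows. This is a reconstruction from the list above, not the actual training script; the target_modules list and warmup ratio are assumptions:

from peft import LoraConfig
from transformers import TrainingArguments

# LoRA settings from the list above; target_modules is an assumed
# Qwen-style attention projection list, not a confirmed setting
lora_config = LoraConfig(
    r=32,
    lora_alpha=16,
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj"],  # assumption
    task_type="CAUSAL_LM",
)

# Optimizer and schedule settings from the list above
training_args = TrainingArguments(
    output_dir="qwen2.5-7b-vietnamese-lora",  # illustrative path
    per_device_train_batch_size=2,
    gradient_accumulation_steps=8,  # effective batch size of 16
    learning_rate=2e-4,
    num_train_epochs=2,
    optim="adamw_torch",
    lr_scheduler_type="linear",
    warmup_ratio=0.03,  # assumption: warmup fraction not stated in the card
    bf16=True,
)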

Model Performance

This model has been fine-tuned specifically for Vietnamese language tasks and should perform well on:

  • Vietnamese instruction following
  • Vietnamese question answering
  • Vietnamese text generation
  • Bilingual (Vietnamese-English) conversations

Limitations

  • The model may still struggle with complex Vietnamese contexts
  • Performance may vary by task and domain
  • The model inherits the limitations of the base Qwen2.5-7B-Instruct model

License

This model is licensed under Apache 2.0.

Citation

If you use this model in your research, please cite:

@misc{key1111_qwen25_7b_vietnamese_enhanced,
  title={Key1111/qwen2.5-7b-vietnamese-enhanced},
  author={Key1111},
  year={2024},
  publisher={Hugging Face},
  howpublished={\url{https://huggingface.co/Key1111/qwen2.5-7b-vietnamese-enhanced}},
}