Key1111/qwen2.5-7b-vietnamese-enhanced

Qwen2.5-7B-Instruct fine-tuned for Vietnamese language tasks with LoRA (Low-Rank Adaptation), with the adapter merged back into the base weights.

Model Details

  • Base Model: Qwen/Qwen2.5-7B-Instruct
  • Fine-tuning Method: LoRA (merged with base model)
  • Language: Vietnamese, English
  • Training Data: Alpaca + ViQuAD datasets
  • Model Size: ~7B parameters
  • Context Length: 2048 tokens (the max sequence length used during fine-tuning)

Usage

Direct Usage

import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

# Load the merged model directly (bf16 halves memory vs. fp32; device_map="auto" uses the GPU if available)
model = AutoModelForCausalLM.from_pretrained(
    "Key1111/qwen2.5-7b-vietnamese-enhanced",
    torch_dtype=torch.bfloat16,
    device_map="auto",
    trust_remote_code=True,
)
tokenizer = AutoTokenizer.from_pretrained("Key1111/qwen2.5-7b-vietnamese-enhanced", trust_remote_code=True)

# Generate text
prompt = "Xin chào! Bạn có thể giúp tôi không?"  # "Hello! Can you help me?"
messages = [{"role": "user", "content": prompt}]

# Apply the chat template, then tokenize and move the inputs to the model's device
text = tokenizer.apply_chat_template(messages, tokenize=False, add_generation_prompt=True)
inputs = tokenizer(text, return_tensors="pt", truncation=True, max_length=512).to(model.device)

outputs = model.generate(
    **inputs,
    max_new_tokens=256,
    temperature=0.7,
    top_p=0.9,
    do_sample=True,
    pad_token_id=tokenizer.eos_token_id,
    eos_token_id=tokenizer.eos_token_id,
)

# Decode only the newly generated tokens, not the echoed prompt
response = tokenizer.decode(outputs[0][inputs["input_ids"].shape[1]:], skip_special_tokens=True)
print(response)
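
For interactive use, you can stream tokens to stdout as they are generated instead of waiting for the full completion. A minimal sketch with transformers' TextStreamer, continuing from the snippet above; the generation parameters are illustrative:

from transformers import TextStreamer

# Print decoded tokens as they are produced; skip_prompt hides the echoed input
streamer = TextStreamer(tokenizer, skip_prompt=True, skip_special_tokens=True)
model.generate(**inputs, max_new_tokens=256, do_sample=True, temperature=0.7, streamer=streamer)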

Using PEFT (if you have the LoRA adapter)

import torch
from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import PeftModel

# Load the base model
base_model = AutoModelForCausalLM.from_pretrained(
    "Qwen/Qwen2.5-7B-Instruct",
    torch_dtype=torch.bfloat16,
    device_map="auto",
    trust_remote_code=True
)
tokenizer = AutoTokenizer.from_pretrained(
    "Qwen/Qwen2.5-7B-Instruct",
    trust_remote_code=True
)

# Attach the LoRA adapter on top of the base model
model = PeftModel.from_pretrained(base_model, "Key1111/qwen2.5-7b-vietnamese-enhanced")

# Use the model
messages = [{"role": "user", "content": "Xin chào, bạn có thể giúp tôi không?"}]  # "Hello, can you help me?"
text = tokenizer.apply_chat_template(messages, tokenize=False, add_generation_prompt=True)
inputs = tokenizer(text, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=512)
response = tokenizer.decode(outputs[0][inputs["input_ids"].shape[1]:], skip_special_tokens=True)
print(response)
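
Once the adapter is loaded this way, you can also fold it back into the base weights yourself for adapter-free inference. A minimal sketch using PEFT's merge_and_unload; the save path is illustrative:

# Fold the LoRA weights into the base model and drop the PEFT wrapper
merged_model = model.merge_and_unload()

# Save a standalone merged checkpoint (path is an example)
merged_model.save_pretrained("qwen2.5-7b-vietnamese-merged")
tokenizer.save_pretrained("qwen2.5-7b-vietnamese-merged")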

In n8n

Use this model directly in Hugging Face Inference nodes:

  • Model: Key1111/qwen2.5-7b-vietnamese-enhanced
  • No additional configuration needed
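
Outside n8n, the same hosted model can be called programmatically. A minimal sketch with huggingface_hub's InferenceClient, assuming the model is actually being served (for example via a dedicated Inference Endpoint); the generation parameters are illustrative:

from huggingface_hub import InferenceClient

# Requires a running endpoint for the model and, typically, an HF access token
client = InferenceClient(model="Key1111/qwen2.5-7b-vietnamese-enhanced")

response = client.chat_completion(
    messages=[{"role": "user", "content": "Xin chào! Bạn có thể giúp tôi không?"}],
    max_tokens=256,
    temperature=0.7,
)
print(response.choices[0].message.content)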

Training Details

  • Base Model: Qwen/Qwen2.5-7B-Instruct
  • Fine-tuning Method: LoRA (Low-Rank Adaptation)
  • LoRA Rank: 32
  • LoRA Alpha: 16
  • Learning Rate: 2e-4
  • Batch Size: 2
  • Gradient Accumulation Steps: 8
  • Max Sequence Length: 2048
  • Training Data:
    • Alpaca format dataset (Vietnamese instructions)
    • ViQuAD dataset (Vietnamese question-answering)
  • Total Training Samples: 5000
  • Training Epochs: 2
  • Optimizer: AdamW
  • Scheduler: Linear warmup
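
For reference, these hyperparameters map onto a PEFT + transformers setup roughly as follows. This is a reconstruction from the list above, not the actual training script; the target_modules list and warmup ratio are assumptions:

from peft import LoraConfig
from transformers import TrainingArguments

# LoRA settings from the list above; target_modules is an assumed
# Qwen-style attention projection list, not a confirmed setting
lora_config = LoraConfig(
    r=32,
    lora_alpha=16,
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj"],  # assumption
    task_type="CAUSAL_LM",
)

# Optimizer and schedule settings from the list above
training_args = TrainingArguments(
    output_dir="qwen2.5-7b-vietnamese-lora",  # illustrative path
    per_device_train_batch_size=2,
    gradient_accumulation_steps=8,  # effective batch size of 16
    learning_rate=2e-4,
    num_train_epochs=2,
    optim="adamw_torch",
    lr_scheduler_type="linear",
    warmup_ratio=0.03,  # assumption: warmup fraction not stated in the card
    bf16=True,
)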

Model Performance

This model has been fine-tuned specifically for Vietnamese language tasks and should perform well on:

  • Vietnamese instruction following
  • Vietnamese question answering
  • Vietnamese text generation
  • Bilingual (Vietnamese-English) conversations

Limitations

  • The model may still struggle with complex Vietnamese contexts
  • Performance may vary by task and domain
  • The model inherits the limitations of the base Qwen2.5-7B-Instruct model

License

This model is licensed under Apache 2.0.

Citation

If you use this model in your research, please cite:

@misc{key1111_qwen25_7b_vietnamese_enhanced,
  title={Key1111/qwen2.5-7b-vietnamese-enhanced},
  author={Key1111},
  year={2024},
  publisher={Hugging Face},
  howpublished={\url{https://huggingface.co/Key1111/qwen2.5-7b-vietnamese-enhanced}},
}