Part of the Malaysian Finetuned Instruct LoRA collection (16 items): continued finetuning of Instruct models using LoRA, from 0.5B up to 72B.
Continued finetuning of https://huggingface.co/Qwen/Qwen2.5-1.5B-Instruct on a highly curated 1.5B-token Malaysian instruction dataset.
Finetuned on mesolitica/Malaysian-SFT to make the model understand Malaysian context.
["q_proj", "k_proj", "v_proj", "o_proj", "gate_proj", "up_proj", "down_proj", "embed_tokens", "lm_head"]
Source code at https://github.com/mesolitica/malaya/tree/master/session/qwen2.5
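As an illustration only, the adapter setup could be expressed with peft roughly as below; the target modules come from the list above, while the rank, alpha, and dropout are placeholder values rather than the released training settings (see the linked source code for the actual scripts):

```python
from peft import LoraConfig, get_peft_model
from transformers import AutoModelForCausalLM

# Base Instruct model that is being continued-finetuned.
base_model = AutoModelForCausalLM.from_pretrained("Qwen/Qwen2.5-1.5B-Instruct")

# Target modules are taken from this card; r / lora_alpha / lora_dropout are
# placeholder values, not the values used for the released checkpoint.
lora_config = LoraConfig(
    r=16,
    lora_alpha=32,
    lora_dropout=0.05,
    bias="none",
    task_type="CAUSAL_LM",
    target_modules=[
        "q_proj", "k_proj", "v_proj", "o_proj",
        "gate_proj", "up_proj", "down_proj",
        "embed_tokens", "lm_head",
    ],
)

model = get_peft_model(base_model, lora_config)
model.print_trainable_parameters()
```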
Based on the official MalayMMLU 0-shot first-token accuracy:
Malaysian-Qwen2.5-1.5B-Instruct (metric: first token, 0-shot, matched by option letter):

| Category | Questions | Accuracy (%) |
|----------------|----------:|-------------:|
| STEM | 2443 | 57.347524 |
| Language | 6288 | 61.084606 |
| Social science | 6918 | 55.854293 |
| Others | 4169 | 54.017750 |
| Humanities | 4395 | 56.336746 |
| Average | 24213 | 57.134597 |
The original model, for comparison:
Qwen2.5-1.5B-Instruct (metric: first token, 0-shot, matched by option letter):

| Category | Questions | Accuracy (%) |
|----------------|----------:|-------------:|
| STEM | 2443 | 57.306590 |
| Language | 6288 | 52.862595 |
| Social science | 6918 | 51.633420 |
| Others | 4169 | 52.554569 |
| Humanities | 4395 | 57.224118 |
| Average | 24213 | 53.698426 |
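For reference, a minimal sketch of one common way to score first-token accuracy over the option letters; the repo id and prompt template are assumptions, and the official MalayMMLU harness may differ in detail:

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

# Hypothetical repo id; point this at the actual released checkpoint.
model_id = "mesolitica/Malaysian-Qwen2.5-1.5B-Instruct"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id, torch_dtype=torch.bfloat16, device_map="auto"
)

def predict_letter(question: str, choices: list[str]) -> str:
    """Pick the option letter whose token gets the highest next-token logit."""
    letters = ["A", "B", "C", "D", "E"][: len(choices)]
    # Illustrative prompt format; not necessarily the official one.
    body = question + "\n" + "\n".join(f"{l}. {c}" for l, c in zip(letters, choices))
    text = tokenizer.apply_chat_template(
        [{"role": "user", "content": body}],
        tokenize=False,
        add_generation_prompt=True,
    )
    inputs = tokenizer(text, return_tensors="pt").to(model.device)
    with torch.no_grad():
        next_token_logits = model(**inputs).logits[0, -1]
    # Compare the logits of the first token of each option letter.
    letter_ids = [tokenizer.encode(l, add_special_tokens=False)[0] for l in letters]
    return letters[int(torch.argmax(next_token_logits[letter_ids]))]

# first-token accuracy = mean(predict_letter(q, opts) == gold_letter) over the benchmark
```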
Based on 0-shot exact first-token match using vLLM Guided Decoding:
Malaysian-Qwen2.5-1.5B-Instruct (metric: full, 0-shot, vLLM Guided Decoding):

| Category | Accuracy (%) |
|----------------|-------------:|
| STEM | 52.517397 |
| Language | 54.834606 |
| Social science | 50.650477 |
| Others | 48.380907 |
| Humanities | 50.693970 |
| Average | 51.542560 |
The original model, for comparison:
Qwen2.5-1.5B-Instruct (metric: full, 0-shot, vLLM Guided Decoding):

| Category | Accuracy (%) |
|----------------|-------------:|
| STEM | 54.809660 |
| Language | 53.101145 |
| Social science | 51.387684 |
| Others | 51.403214 |
| Humanities | 55.472127 |
| Average | 52.921984 |
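A minimal sketch of the guided-decoding setup named above, assuming a recent vLLM release that exposes GuidedDecodingParams; the repo id and prompt are placeholders, not the exact evaluation script:

```python
from vllm import LLM, SamplingParams
from vllm.sampling_params import GuidedDecodingParams

# Hypothetical repo id; adjust to the checkpoint being evaluated.
llm = LLM(model="mesolitica/Malaysian-Qwen2.5-1.5B-Instruct")

# Constrain generation to the option letters so the exact first-token match
# is always one of the valid answers.
letters = ["A", "B", "C", "D"]
params = SamplingParams(
    temperature=0.0,
    max_tokens=1,
    guided_decoding=GuidedDecodingParams(choice=letters),
)

prompts = ["<chat-templated MalayMMLU question here>"]  # placeholder prompt
outputs = llm.generate(prompts, params)
predictions = [o.outputs[0].text.strip() for o in outputs]
# accuracy = mean(pred == gold_letter) over the benchmark
```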
Special thanks to https://www.sns.com.my for the 8x H100 node!