Part of the Malaysian Finetuned Instruct LoRA collection: continued finetuning of Instruct models using LoRA, from 0.5B up to 72B (16 items).
Continued finetuning of https://huggingface.co/meta-llama/Llama-3.1-70B-Instruct on a highly curated 1.5B-token Malaysian instruction dataset.
Finetuned on mesolitica/Malaysian-SFT so the model understands Malaysian context.
LoRA target modules: `["q_proj", "k_proj", "v_proj", "o_proj", "gate_proj", "up_proj", "down_proj", "embed_tokens", "lm_head"]`
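These target modules cover every attention and MLP projection plus the embedding table and output head. A minimal sketch of the corresponding peft `LoraConfig` follows; the rank and alpha values are assumptions, since the card does not state them:

```python
from peft import LoraConfig

# Target modules listed in the card: all attention and MLP projections,
# plus the embedding table and output head.
target_modules = ["q_proj", "k_proj", "v_proj", "o_proj",
                  "gate_proj", "up_proj", "down_proj",
                  "embed_tokens", "lm_head"]

config = LoraConfig(
    r=16,               # assumed rank, not stated in the card
    lora_alpha=32,      # assumed scaling, not stated in the card
    target_modules=target_modules,
    task_type="CAUSAL_LM",
)
```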
Source code at https://github.com/mesolitica/malaya/tree/master/session/llama3
Based on 0-shot official MalayMMLU first-token accuracy (matching by answer letter), Malaysian-Llama-3.1-70B-Instruct scores:

| Category | Questions | Accuracy (%) |
|---|---:|---:|
| STEM | 2443 | 75.89 |
| Language | 6288 | 75.54 |
| Social science | 6918 | 72.26 |
| Others | 4169 | 71.86 |
| Humanities | 4395 | 78.20 |
| Average (question-weighted) | 24213 | 74.49 |
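The reported average is weighted by the number of questions per category rather than a simple mean; a quick check using the counts and accuracies from the table above:

```python
# Question counts and accuracies (%) per MalayMMLU category, as reported above.
counts = {"STEM": 2443, "Language": 6288, "Social science": 6918,
          "Others": 4169, "Humanities": 4395}
acc = {"STEM": 75.89, "Language": 75.54, "Social science": 72.26,
       "Others": 71.86, "Humanities": 78.20}

# The question-weighted mean reproduces the reported average (~74.49).
weighted_avg = sum(acc[c] * counts[c] for c in counts) / sum(counts.values())
print(round(weighted_avg, 2))
```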
While the original Llama-3.1-70B-Instruct scores:

| Category | Questions | Accuracy (%) |
|---|---:|---:|
| STEM | 2443 | 78.92 |
| Language | 6288 | 78.77 |
| Social science | 6918 | 77.26 |
| Others | 4169 | 75.27 |
| Humanities | 4395 | 82.57 |
| Average (question-weighted) | 24213 | 78.44 |
Based on 0-shot exact first-token match using vLLM Guided Decoding, Malaysian-Llama-3.1-70B-Instruct scores:

| Category | Questions | Accuracy (%) |
|---|---:|---:|
| STEM | 2443 | 68.69 |
| Language | 6288 | 69.35 |
| Social science | 6918 | 67.62 |
| Others | 4169 | 65.92 |
| Humanities | 4395 | 69.90 |
| Average (question-weighted) | 24213 | 68.30 |
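The guided-decoding evaluation constrains generation so the model's first token is scored directly against the gold answer letter. A minimal illustrative sketch of that scoring rule (not the official MalayMMLU harness; the tokenization details there may differ):

```python
def first_token_match(generated: str, gold_letter: str) -> bool:
    """Score a multiple-choice answer by exact match on the first
    generated token, ignoring trailing punctuation like '.' or ')'."""
    stripped = generated.strip()
    if not stripped:
        return False
    first = stripped.split()[0].rstrip(".):")
    return first.upper() == gold_letter.upper()

# Example: an answer beginning "B. ..." counts as a correct 'B'.
print(first_token_match("B. Kuala Lumpur", "B"))
```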
While the original Llama-3.1-70B-Instruct scores:

| Category | Questions | Accuracy (%) |
|---|---:|---:|
| STEM | 2443 | 76.67 |
| Language | 6288 | 77.16 |
| Social science | 6918 | 74.91 |
| Others | 4169 | 72.66 |
| Humanities | 4395 | 78.93 |
| Average (question-weighted) | 24213 | 76.01 |
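Putting the two evaluations side by side (averages taken from the results above), the Malaysian finetune gives up a few points of MalayMMLU accuracy relative to the base model, in exchange for the Malaysian-context instruction following it was trained for:

```python
# Average MalayMMLU accuracies reported above.
base = {"first_token": 78.44, "guided": 76.01}
finetuned = {"first_token": 74.49, "guided": 68.30}

for metric in base:
    delta = base[metric] - finetuned[metric]
    print(f"{metric}: -{delta:.2f} points vs base")
```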
Special thanks to https://www.sns.com.my for the 8x H100 node!