## Training Data
- jaeyong2/Qwen3-06B-Ko-DPO
- jaeyong2/Qwen3-06B-Ko-DPO-2
- jaeyong2/Qwen3-06B-Ko-DPO-3
- jaeyong2/Qwen3-06B-En-DPO-2
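The datasets above are preference pairs used for Direct Preference Optimization (DPO). As a refresher, the per-pair DPO objective is `-log σ(β[(log π(y_w|x) − log π_ref(y_w|x)) − (log π(y_l|x) − log π_ref(y_l|x))])`, where `y_w`/`y_l` are the chosen/rejected responses. Below is a minimal sketch in plain Python; the function name and the `beta=0.1` default are illustrative, not details of this training run:

```python
import math

def dpo_loss(policy_chosen_logp, policy_rejected_logp,
             ref_chosen_logp, ref_rejected_logp, beta=0.1):
    """DPO loss for a single preference pair.

    Each argument is the summed log-probability of the chosen or
    rejected response under the trainable policy or the frozen
    reference model.
    """
    logits = beta * ((policy_chosen_logp - ref_chosen_logp)
                     - (policy_rejected_logp - ref_rejected_logp))
    # -log(sigmoid(logits)) == log(1 + exp(-logits)),
    # written in a numerically stable form for large |logits|
    if logits >= 0:
        return math.log1p(math.exp(-logits))
    return -logits + math.log1p(math.exp(logits))

# If policy and reference agree exactly, the loss is log(2) ~= 0.693;
# the loss drops below that once the policy widens the chosen-vs-rejected
# margin relative to the reference.
loss = dpo_loss(-10.0, -14.0, -11.0, -13.0, beta=0.1)
```

Minimizing this loss pushes the policy to prefer the chosen response by a wider log-probability margin than the reference model does.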
## Evaluation

```shell
lm_eval --model hf \
    --model_args pretrained=jaeyong2/Qwen3-0.6B-DPO \
    --tasks kmmlu,mmlu,gsm8k \
    --device cuda:0 \
    --batch_size 1 \
    --num_fewshot 5
```
| (5-shot) | Qwen3-0.6B-DPO | Qwen3-0.6B | naver-hyperclovax/HyperCLOVAX-SEED-Text-Instruct-0.5B |
|---|---|---|---|
| MMLU | 0.47 | 0.47 | 0.44 |
| KMMLU | 0.34 | 0.35 | 0.38 |
| GSM8K | 0.47 | 0.42 | 0.39 |
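As a rough single-number summary of the table above, the unweighted mean of the three 5-shot scores can be computed as follows (the values are transcribed from the table; equal weighting of the benchmarks is an illustrative choice, not a standard metric):

```python
# 5-shot scores transcribed from the evaluation table
scores = {
    "Qwen3-0.6B-DPO": {"MMLU": 0.47, "KMMLU": 0.34, "GSM8K": 0.47},
    "Qwen3-0.6B":     {"MMLU": 0.47, "KMMLU": 0.35, "GSM8K": 0.42},
    "HyperCLOVAX-SEED-Text-Instruct-0.5B":
                      {"MMLU": 0.44, "KMMLU": 0.38, "GSM8K": 0.39},
}

# Unweighted mean over the three benchmarks, per model
averages = {model: round(sum(s.values()) / len(s), 3)
            for model, s in scores.items()}
```

By this crude average the DPO model edges out the base model, driven mainly by the GSM8K gain, while trailing slightly on KMMLU.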
## License

- Qwen/Qwen3-0.6B: Apache License 2.0 (https://choosealicense.com/licenses/apache-2.0/)
## Acknowledgement

This research was supported by the TPU Research Cloud program.