This organization hosts different versions of Qwen3-0.6B that differ only in the post-training method used. The post-training dataset should be the HH-RLHF dataset.
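Preference-based post-training methods generally expect each training record to have a prompt plus a preferred and a dispreferred completion. The sketch below shows one way to derive that from an HH-RLHF-style record. It assumes the public `Anthropic/hh-rlhf` layout, where `chosen` and `rejected` are full dialogues that share everything up to the final `Assistant:` turn. The helper name `split_pair` and the sample record are hypothetical and are not taken from the repositories listed here.

```python
ASSISTANT_TAG = "\n\nAssistant:"


def split_pair(chosen: str, rejected: str) -> dict:
    """Split an HH-RLHF-style pair into a shared prompt and two completions.

    Assumption: both dialogues are identical up to and including the last
    "Assistant:" tag found in `chosen`.
    """
    cut = chosen.rfind(ASSISTANT_TAG)
    if cut == -1:
        raise ValueError("no assistant turn found in 'chosen'")
    cut += len(ASSISTANT_TAG)
    prompt = chosen[:cut]
    if not rejected.startswith(prompt):
        raise ValueError("'chosen' and 'rejected' do not share a prompt")
    return {
        "prompt": prompt,
        "chosen": chosen[cut:].strip(),
        "rejected": rejected[cut:].strip(),
    }


# Hypothetical record, written in the HH-RLHF dialogue format.
sample = {
    "chosen": "\n\nHuman: What is 2 + 2?\n\nAssistant: 4.",
    "rejected": "\n\nHuman: What is 2 + 2?\n\nAssistant: I don't know.",
}
pair = split_pair(sample["chosen"], sample["rejected"])
```

Records that fail the shared-prompt check raise an error instead of being silently truncated, so malformed pairs can be counted or dropped explicitly.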
models (19)
AIPlans/Qwen3-0.6B-RM-hs2 • Text Classification • 0.6B • 78 • 1
AIPlans/Qwen3-0.6B-SFT-hs2 • Text Generation • 0.6B • 43
AIPlans/Qwen3-0.6B-ORPO • Text Generation • 32
AIPlans/Qwen3-0.6B-DPO_NOTLORA • Text Generation • 0.6B • 30
AIPlans/Qwen3-0.6B-KTO • Text Generation • 32 • 1
AIPlans/Qwen3-0.6B-DPO • Text Generation • 27
AIPlans/qwen3-0.6b-hh-rlhf-sft • 0.6B • 19
AIPlans/Qwen3-0.6B-KTO_trial • Text Generation • 0.6B • 21 • 1
AIPlans/qwen3-0.6b-sft-hh-rlhf-lora
AIPlans/qwen3-0.6b-base-PPO-PM • 1
datasets (16)
AIPlans/helpsteer2-helpfulness-preference-cleaned • 6.99k • 42
AIPlans/trackio-experiments • 15
AIPlans/ultrafeedback_binarized_chinese • 14k • 15
AIPlans/ultrafeedback_binarized • 14k • 13
AIPlans/FilteredPKU-SafeRLHF_chinese • 12k • 16
AIPlans/FilteredPKU-SafeRLHF • 12k • 26
AIPlans/SafetyBench_WithLabels_Better_chinese • 546 • 26
AIPlans/SafetyBench_WithLabels • 546 • 14
AIPlans/ToxiGen_chinese • 1k • 14
AIPlans/ToxiGen • 1k • 12