AIPlans's Collections
Post Training Versions - Qwen 0.6B
Red Teaming Alignment Evals
Model Diffing
Updated 12 days ago
AIPlans/qwen3-8b-dpo-hh-rlhf • Updated Jul 4
AIPlans/qwen3-8b-ipo-hh-rlhf • Text Generation • Updated Jul 17 • 7
AIPlans/dpo_qwen0_6b_fft • 0.6B • Updated Sep 24 • 1
AIPlans/qwen3-0.6b-dpo-lora • Text Generation • 0.6B • Updated Sep 18 • 13 • 1
AIPlans/qwen3-0.6B-reward-hh-rlhf • Text Generation • 0.6B • Updated Sep 13 • 11
Note: just the reward model
AIPlans/qwen3-0.6b-base-PPO-PM • Updated Sep 27 • 1