lblaoke's Collections
Draft Models
Updated May 12
lblaoke/qwama-0.5b-skywork-pref-dpo-llama-factory-v1 • Updated Mar 19 • 7
lblaoke/qwama-0.5b-skywork-pref-dpo-trl-v1 • Updated Mar 19 • 5
lblaoke/qwama-0.5b-skywork-pref-dpo-trl-v2 • Updated Mar 21 • 25
lblaoke/qwama-0.5b-skywork-pref-sft-rejected-trl-v3 • Updated Mar 28 • 4
lblaoke/qwama-0.5b-skywork-pref-sft-chosen-trl-v3 • Updated Mar 28 • 38
lblaoke/qwama-0.5b-skywork-pref-sft-rejected-chosen-trl-v3 • Updated Mar 28 • 6
lblaoke/qwama-0.5b-skywork-pref-sft-chosen-dpo-trl-v3 • Updated Mar 28 • 5
lblaoke/qwama-0.5b-hh-rlhf-sft-chosen-trl-v4 • Updated Apr 8 • 7
lblaoke/opt-125m-hh-rlhf-dpo-trl-v5 • Updated May 8 • 18
lblaoke/opt-125m-hh-rlhf-chosen-sft-trl-v5 • Updated May 7 • 21
lblaoke/opt-350m-hh-rlhf-chosen-sft-trl-v5 • Updated May 11 • 22
lblaoke/opt-350m-hh-rlhf-dpo-trl-v5 • Updated May 12 • 50