Xiaoyang Cao

Sean13

https://xiaoyangcao1113.github.io/

AI & ML interests

RLHF, Deep Reinforcement Learning

Recent Activity

updated a model 2 days ago

Sean13/mistral-7b-instruct-v0.2-ipo-full

updated a model 10 days ago

Sean13/mistral-7b-instruct-v0.2-slic_hf-full

published a model 13 days ago

Sean13/mistral-7b-instruct-v0.2-rslic_hf-full

Organizations

None yet

models 7

Sean13/mistral-7b-instruct-v0.2-ipo-full

Text Generation • 7B • Updated 2 days ago • 147

Sean13/mistral-7b-instruct-v0.2-slic_hf-full

Text Generation • 7B • Updated 10 days ago • 16

Sean13/mistral-7b-instruct-v0.2-rslic_hf-full

Updated 13 days ago

Sean13/mistral-7b-instruct-v0.2-rsimpo-full

Text Generation • 7B • Updated 17 days ago • 82

Sean13/mistral-7b-instruct-v0.2-ripo-full

Text Generation • 7B • Updated 18 days ago • 19

Sean13/mistral-7b-instruct-v0.2-emdpo-full

7B • Updated 28 days ago • 39

Sean13/mistral-7b-instruct-v0.2-dpo-full

Text Generation • 7B • Updated Jul 20 • 21

datasets 0

None public yet