Wei Xiong

weqweasdas
https://weixiongust.github.io/WeiXiongUST/index.html

AI & ML interests

Machine learning, RLHF

Recent Activity

upvoted a paper 5 days ago
Chain-of-Experts: Unlocking the Communication Power of Mixture-of-Experts Models
updated a dataset 19 days ago
weqweasdas/numina_prompt_non_dedu
published a dataset 19 days ago
weqweasdas/numina_prompt_non_dedu

Organizations

reward modeling, raft_study, Directional Preference Alignment, RLHFlow, RRLHF, TIRData, feedbackagent, myselfrew, selfcorrexp, selfcorrexp2, mytestdpo, tmpmodelsave, qwselfcorr, dsrtrain, dsrselfcorr, ptllama, raftstudy

authored 4 papers about 1 year ago

RLHF Workflow: From Reward Modeling to Online RLHF

Paper • 2405.07863 • Published May 13, 2024 • 72

Iterative Preference Learning from Human Feedback: Bridging Theory and Practice for RLHF under KL-Constraint

Paper • 2312.11456 • Published Dec 18, 2023 • 1

LMFlow: An Extensible Toolkit for Finetuning and Inference of Large Foundation Models

Paper • 2306.12420 • Published Jun 21, 2023 • 2

RAFT: Reward rAnked FineTuning for Generative Foundation Model Alignment

Paper • 2304.06767 • Published Apr 13, 2023 • 2