Shihan Dou

Ablustrund

AI & ML interests

Natural Language Processing, Large Language Models

Recent Activity

upvoted a paper 29 days ago

Reasoning or Memorization? Unreliable Results of Reinforcement Learning Due to Data Contamination

authored a paper about 1 month ago

Loose lips sink ships: Mitigating Length Bias in Reinforcement Learning from Human Feedback

authored a paper about 1 month ago

Improving Generalization of Alignment with Human Preferences through Group Invariant Learning

liked 4 datasets almost 2 years ago

vikp/evol_instruct_v2_filtered_109k

Viewer • Updated Aug 29, 2023 • 110k • 3 • 3

mrqa-workshop/mrqa

Viewer • Updated Jan 24, 2024 • 585k • 657 • 24

lucadiliello/naturalquestionsshortqa

Viewer • Updated Jun 6, 2023 • 117k • 34 • 3

openbmb/UltraFeedback

Viewer • Updated Dec 29, 2023 • 64k • 2.85k • 379

liked 2 models about 2 years ago

fnlp/moss-rlhf-sft-model-7B-en

Updated Jul 14, 2023 • 2

Ablustrund/moss-rlhf-reward-model-7B-zh

Updated Jul 13, 2023 • 23