Shihan Dou
Ablustrund
AI & ML interests
Natural Language Processing, Large Language Models
Recent Activity
authored a paper about 1 month ago
Loose lips sink ships: Mitigating Length Bias in Reinforcement Learning from Human Feedback
authored a paper about 1 month ago
Improving Generalization of Alignment with Human Preferences through Group Invariant Learning