scientific-RLHF

community

AI & ML interests

None defined yet.

Recent Activity

jiahaoq authored a paper 2 days ago

ReasonFlux-PRM: Trajectory-Aware PRMs for Long Chain-of-Thought Reasoning in LLMs

jiahaoq authored a paper 29 days ago

MaxMin-RLHF: Towards Equitable Alignment of Large Language Models with Diverse Human Preferences

jiahaoq authored a paper 29 days ago

TreeBoN: Enhancing Inference-Time Alignment with Speculative Tree-Search and Best-of-N Sampling

models 0

None public yet

datasets 0

None public yet