Mila Iterative DPO
university
AI & ML interests: None defined yet.
Recent Activity
arianhosseini authored a paper 7 days ago: Deep Language Networks: Joint Prompt Training of Stacked LLMs using Variational Inference
arianhosseini authored a paper 7 days ago: The N+ Implementation Details of RLHF with PPO: A Case Study on TL;DR Summarization
arianhosseini authored a paper 7 days ago: Generative Verifiers: Reward Modeling as Next-Token Prediction
Team members: 3
MilaRLHF's models: None public yet