Cross-lingual Transfer of Reward Models
Collection
A collection of the synthetic preference data and trained reward models from "Cross-lingual Transfer of Reward Models in Multilingual Alignment".
5 items