This collection contains the synthetic preference data and trained reward models from "Cross-lingual Transfer of Reward Models in Multilingual Alignment".
IQWiki-XFACT