xLAM-2 Collection A family of Large Action Models for multi-turn conversation and tool use • 10 items • Updated May 5 • 16
Article Open-R1: a fully open reproduction of DeepSeek-R1 By eliebak and 2 others • Jan 28 • 868
TS-Align: A Teacher-Student Collaborative Framework for Scalable Iterative Finetuning of Large Language Models Paper • 2405.20215 • Published May 30, 2024 • 1
Aligning Language Models Using Follow-up Likelihood as Reward Signal Paper • 2409.13948 • Published Sep 20, 2024 • 1
Handbook v0.1 models and datasets Collection Models and datasets for v0.1 of the alignment handbook • 6 items • Updated Nov 10, 2023 • 24