Elastic Reset
Models and datasets for Elastic Reset (NeurIPS 2023); code at https://github.com/mnoukhov/elastic-reset
mnoukhov/llama-7b-se-peft • Updated Jun 1, 2023
mnoukhov/llama-7b-se-rl-peft • Updated Oct 11, 2023
mnoukhov/llama-7b-se-rm-peft • Updated Oct 11, 2023
HuggingFaceH4/stack-exchange-preferences • Viewer • Updated Mar 8, 2023 • 10.8M • 1.96k • 133

Asynchronous RLHF
Models and datasets for the Asynchronous RLHF paper; code at https://github.com/mnoukhov/async_rlhf
mnoukhov/pythia410m-sft-tldr • Text Generation • 0.4B • Updated May 16, 2024 • 1.77k
mnoukhov/pythia1b-sft-tldr • Text Generation • 1B • Updated Jul 3, 2024 • 7
mnoukhov/pythia2.8b-sft-tldr • Text Generation • 3B • Updated Jul 7, 2024 • 163
mnoukhov/pythia410m-rm-tldr6.9b • Text Classification • 0.4B • Updated Jun 20, 2024 • 214