toloka-ai - a toloka Collection

updated Oct 27, 2023

Safe RLHF: Safe Reinforcement Learning from Human Feedback

Paper • 2310.12773 • Published Oct 19, 2023