ajagota71/llama-3-2-1b-rlhf-kl-p5-target-2p5-lr-3e-6 • Reinforcement Learning • 1B • Updated 12 days ago • 2
ajagota71/llama-3-2-1b-rlhf-kl-p5-target-2p5-lr-3e-6-checkpoint-epoch-100 • Reinforcement Learning • 1B • Updated 12 days ago • 2
ajagota71/llama-3-2-1b-rlhf-kl-p5-target-2p5-lr-3e-6-checkpoint-epoch-80 • Reinforcement Learning • 1B • Updated 12 days ago • 1
ajagota71/llama-3-2-1b-rlhf-kl-p5-target-2p5-lr-3e-6-checkpoint-epoch-60 • Reinforcement Learning • 1B • Updated 12 days ago • 1
ajagota71/llama-3-2-1b-rlhf-kl-p5-target-2p5-lr-3e-6-checkpoint-epoch-40 • Reinforcement Learning • 1B • Updated 12 days ago • 1
ajagota71/llama-3-2-1b-rlhf-kl-p5-target-2p5-lr-3e-6-checkpoint-epoch-20 • Reinforcement Learning • 1B • Updated 12 days ago • 1
ajagota71/llama-3-2-1b-rlhf-kl-p4-target-3-checkpoint-epoch-100 • Reinforcement Learning • 1B • Updated 12 days ago • 1
ajagota71/llama-3-2-1b-rlhf-kl-p4-target-3-checkpoint-epoch-80 • Reinforcement Learning • 1B • Updated 12 days ago • 2
ajagota71/llama-3-2-1b-rlhf-kl-p4-target-3-checkpoint-epoch-60 • Reinforcement Learning • 1B • Updated 12 days ago • 1
ajagota71/ajagota71_pythia-70m-s-nlp-detox-checkpoint-epoch-100_10000_samples_detoxified • Viewer • Updated 15 days ago • 10k • 100
ajagota71/ajagota71_pythia-70m-s-nlp-detox-checkpoint-epoch-100_2000_samples_detoxified • Viewer • Updated 15 days ago • 2k • 101
ajagota71/ajagota71_pythia-70m-s-nlp-detox-checkpoint-epoch-100_500_samples_detoxified • Viewer • Updated 15 days ago • 500 • 102
ajagota71/ajagota71_pythia-1b-detox-epoch-100_2000_samples_detoxified • Viewer • Updated May 10 • 2k • 12
ajagota71/ajagota71_pythia-160m-detox-epoch-100_2000_samples_detoxified • Viewer • Updated May 9 • 2k • 12