ajagota71/llama-3-2-1b-rlhf-kl-p5-target-2p5-lr-3e-6-checkpoint-epoch-20

Reinforcement Learning

text-generation

text-generation-inference

llama-3-2-1b-rlhf-kl-p5-target-2p5-lr-3e-6-checkpoint-epoch-20 / tokenizer_config.json

Commit History

Checkpoint after epoch 20

fcb22ee
verified

ajagota71 committed 14 days ago