Llama-3.1-8B-tuned-lens / README.md

kdhole

Update README.md

1fb082f verified 11 months ago

---
license: openrail
datasets:
  - bookcorpus/bookcorpus
language:
  - en
base_model:
  - meta-llama/Llama-3.1-8B
---

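This is a tuned lens for Llama 3.1 8B. For each layer of the model, an affine translator has been trained so that decoding that layer's hidden state yields a next-token distribution with minimal KL divergence from the model's final-layer output. The base model's weights are not modified.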
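
To make the training objective concrete, here is a minimal pure-Python sketch of the per-layer KL loss on a toy example. It is an illustration under stated assumptions, not the `tuned-lens` implementation: the 3-token vocabulary, the 2-dimensional hidden states, the helper names (`lens_logits`, `kl_divergence`), and all numeric values are hypothetical, and the final layer norm that a real lens applies before unembedding is omitted for brevity.

```python
import math


def softmax(logits):
    """Convert logits to probabilities (shifted by the max for numerical stability)."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def matvec(matrix, vec):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(w * x for w, x in zip(row, vec)) for row in matrix]


def lens_logits(hidden, weight, bias, unembed):
    """Apply an affine translator to a hidden state, then decode with the unembedding."""
    translated = [t + b for t, b in zip(matvec(weight, hidden), bias)]
    return matvec(unembed, translated)


def kl_divergence(p, q):
    """KL(p || q) for two discrete distributions over the same vocabulary."""
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)


# Hypothetical toy model: vocabulary of 3 tokens, hidden size 2.
unembed = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
final_hidden = [2.0, -1.0]  # hidden state at the last layer
mid_hidden = [1.0, -0.5]    # hidden state at some earlier layer (assumed)

final_probs = softmax(matvec(unembed, final_hidden))

# An identity translator with zero bias decodes the raw intermediate state.
identity = [[1.0, 0.0], [0.0, 1.0]]
zero_bias = [0.0, 0.0]
untrained_loss = kl_divergence(
    final_probs, softmax(lens_logits(mid_hidden, identity, zero_bias, unembed))
)

# A translator that maps mid_hidden onto final_hidden drives the loss to zero.
scaled = [[2.0, 0.0], [0.0, 2.0]]
trained_loss = kl_divergence(
    final_probs, softmax(lens_logits(mid_hidden, scaled, zero_bias, unembed))
)

print("KL with identity translator:", untrained_loss)
print("KL with fitted translator:  ", trained_loss)
```

In this toy setup the fitted translator is chosen by hand; actual training would instead learn one weight matrix and bias per layer by gradient descent on this loss, averaged over many tokens.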

pip install tuned-lens

python -m tuned_lens train \
  --model.name meta-llama/Llama-3.1-8B \
  --data.name bookcorpus/bookcorpus \
  --per_gpu_batch_size=1 \
  --output my_lenses/meta-llama/Llama-3.1-8B

python -m tuned_lens eval \
  --data.name bookcorpus/bookcorpus \
  --model.name meta-llama/Llama-3.1-8B \
  --tokens 16400000 \
  --lens_name my_lenses/meta-llama/Llama-3.1-8B \
  --output evaluation/meta-llama/Llama-3.1-8B