license: openrail
datasets:
  - bookcorpus/bookcorpus
language:
  - en
base_model:
  - meta-llama/Llama-3.1-8B

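This is a tuned lens for Llama 3.1 8B: for each layer of the model, an affine translator has been trained so that decoding that layer's hidden state yields a next-token distribution with minimal KL divergence from the model's final-layer output. The base model's own weights are left unchanged.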
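
To make the training objective concrete, here is a minimal sketch of the KL divergence between a final-layer distribution and a lens prediction, computed from raw logits. It uses only the Python standard library and does not call `tuned-lens`; the helper names (`log_softmax`, `kl_divergence`) and the three-token logit vectors are hypothetical, and the direction KL(final || lens) is an assumption about how the loss is set up.

```python
import math


def log_softmax(logits):
    """Return log-probabilities for a list of logits (numerically stable)."""
    m = max(logits)
    log_z = m + math.log(sum(math.exp(x - m) for x in logits))
    return [x - log_z for x in logits]


def kl_divergence(final_logits, lens_logits):
    """KL(final || lens) in nats, with both distributions given as logits."""
    log_p = log_softmax(final_logits)  # final-layer distribution (target)
    log_q = log_softmax(lens_logits)   # distribution decoded from an earlier layer
    return sum(math.exp(lp) * (lp - lq) for lp, lq in zip(log_p, log_q))


# Toy logits over a three-token vocabulary (made-up values for illustration).
final_logits = [2.0, 0.5, -1.0]
close_lens_logits = [1.8, 0.6, -0.9]  # nearly agrees with the final layer
far_lens_logits = [-1.0, 0.5, 2.0]    # ranks the tokens in the opposite order

kl_close = kl_divergence(final_logits, close_lens_logits)
kl_far = kl_divergence(final_logits, far_lens_logits)
print(f"KL for close prediction: {kl_close:.4f} nats")
print(f"KL for far prediction:   {kl_far:.4f} nats")
```

A lens prediction that agrees with the final layer gives a KL near zero, and one that disagrees gives a larger value. Training pushes each layer's value toward zero.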

pip install tuned-lens
python -m tuned_lens train \
  --model.name meta-llama/Llama-3.1-8B \
  --data.name bookcorpus/bookcorpus \
  --per_gpu_batch_size=1 \
  --output my_lenses/meta-llama/Llama-3.1-8B
python -m tuned_lens eval \
  --data.name bookcorpus/bookcorpus \
  --model.name meta-llama/Llama-3.1-8B \
  --tokens 16400000 \
  --lens_name my_lenses/meta-llama/Llama-3.1-8B \
  --output evaluation/meta-llama/Llama-3.1-8B